Carotenoid Biosynthesis

TECHNICAL FIELD 

The invention relates to methods and materials for producing carotenoids, and in 
particular, to nucleic acid molecules, polypeptides, host cells, and methods that can be 
used for producing carotenoids. 

BACKGROUND 

Astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'-dione) is the primary carotenoid
that imparts the pink pigment to the eggs, flesh, and skin of salmon, trout, and shrimp.
Most animals cannot synthesize carotenoids. Rather, the pigments are acquired through
the food chain from marine algae and phytoplankton, the primary producers of
astaxanthin. Astaxanthin (ATX) exists in three configurational isomers [(3S, 3'S), (3R, 3'R), and
(3S, 3'R; 3R, 3'S)]; however, ATX is found in the marine environment only in the (3S, 3'S)
form. Consequently, this form is considered the natural and most desirable form of ATX.

Although astaxanthin has been commercially extracted from some yeast and
crustacean species and has been chemically synthesized as a 1:2:1 mixture of the (3S,3'S)-,
(3S,3'R)-, and (3R,3'R)-isomers, astaxanthin is limited in availability and is expensive to
purchase. See, Torrisen et al. (1989) Crit. Rev. Aquatic Sci. 1:209; and Mayer (1994)
Pure & Appl. Chem. 66:931-938. Thus, there is a need for a less expensive source of the
naturally-occurring (3S,3'S) astaxanthin.

SUMMARY 

The invention is based on methods and materials for producing carotenoids such
as lycopene, zeaxanthin, zeaxanthin diglucoside, canthaxanthin, β-carotene, lutein, and
astaxanthin. Such carotenoids can be used as nutritional supplements in humans and can
be formulated for use in aquaculture or as an animal feed. The invention provides nucleic
acid molecules that can be used to engineer host cells having the ability to produce
particular carotenoids and polypeptides that can be used in cell-free systems to make
particular carotenoids. The engineered cells described herein can be used to produce
large quantities of carotenoids.

In one aspect, the invention features an isolated nucleic acid having at least 76%
sequence identity to the nucleotide sequence of SEQ ID NO:1 (e.g., at least 80%, 85%,
90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:1) or to a
fragment of SEQ ID NO:1 at least 33 contiguous nucleotides in length. An isolated
nucleic acid can encode a zeaxanthin glucosyl transferase polypeptide at least 75%
identical to the amino acid sequence of SEQ ID NO:2. Expression vectors containing
such nucleic acids operably linked to an expression control element also are featured.

In another aspect, the invention features an isolated nucleic acid having at least
78% sequence identity to the nucleotide sequence of SEQ ID NO:3 (e.g., at least 80%,
85%, 90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:3) or to a
fragment of SEQ ID NO:3 at least 32 contiguous nucleotides in length. An isolated
nucleic acid can encode a lycopene β-cyclase polypeptide at least 83% identical to the
amino acid sequence of SEQ ID NO:4. β-carotene can be made by contacting lycopene
with a polypeptide encoded by such isolated nucleic acids. The invention also features an
expression vector that includes such nucleic acids operably linked to an expression
control element.

In yet another aspect, the invention features an isolated nucleic acid having at least
81% sequence identity to the nucleotide sequence of SEQ ID NO:5 (e.g., at least 85%,
90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:5) or to a
fragment of SEQ ID NO:5 at least 60 contiguous nucleotides in length. An isolated
nucleic acid also can encode a geranylgeranyl pyrophosphate synthase polypeptide at
least 85% identical to the amino acid sequence of SEQ ID NO:6. Geranylgeranyl
pyrophosphate can be made by contacting farnesyl pyrophosphate and isopentenyl
pyrophosphate with a polypeptide encoded by such nucleic acids. Expression vectors that
include such nucleic acids operably linked to an expression control element also are
featured.

Isolated nucleic acids having at least 82% sequence identity to the nucleotide
sequence of SEQ ID NO:7 (e.g., at least 85%, 90%, or 95% sequence identity to the
nucleotide sequence of SEQ ID NO:7) or to a fragment of SEQ ID NO:7 at least 30
contiguous nucleotides in length also are featured. An isolated nucleic acid also can
encode a phytoene desaturase polypeptide at least 90% identical to the amino acid
sequence of SEQ ID NO:8. Lycopene can be made by contacting phytoene with a
polypeptide encoded by such nucleic acids. An expression vector that includes such
nucleic acids operably linked to an expression control element also is featured.

The invention also features an isolated nucleic acid having at least 82% sequence 
identity to the nucleotide sequence of SEQ ID NO:9 (e.g., at least 85%, 90%, or 95% 
sequence identity to the nucleotide sequence of SEQ ID NO:9) or to a fragment of SEQ 
ID NO:9 at least 23 contiguous nucleotides in length. An isolated nucleic acid also can 
encode a phytoene synthase polypeptide at least 89% identical to the amino acid sequence 
of SEQ ID NO:10. Phytoene can be made by contacting geranylgeranyl pyrophosphate
with a polypeptide encoded by such nucleic acids. An expression vector that includes 
such nucleic acids operably linked to an expression control element also is featured. 

In yet another aspect, the invention features an isolated nucleic acid having at least
85% sequence identity to the nucleotide sequence of SEQ ID NO:11 (e.g., at least 90% or
95% identity to the nucleotide sequence of SEQ ID NO:11) or to a fragment of SEQ ID
NO:11 at least 36 contiguous nucleotides in length. An isolated nucleic acid can encode a
β-carotene hydroxylase polypeptide at least 90% identical to the amino acid sequence of
SEQ ID NO:12. Zeaxanthin can be made by contacting β-carotene with a polypeptide
encoded by such nucleic acids. Astaxanthin can be made by contacting canthaxanthin
with a polypeptide encoded by such nucleic acids. The invention also features an
expression vector that includes such nucleic acids operably linked to an expression
control element.

The invention also features membranous bacteria (e.g., a Rhodobacter species)
that include at least one exogenous nucleic acid encoding phytoene desaturase, lycopene
β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase, wherein expression of
the at least one exogenous nucleic acid produces detectable amounts of astaxanthin in the
membranous bacteria. The amino acid sequence of the phytoene desaturase can be at
least 90% identical to the amino acid sequence of SEQ ID NO:8. The amino acid
sequence of the lycopene β-cyclase can be at least 83% identical to the amino acid
sequence of SEQ ID NO:4. The amino acid sequence of the β-carotene hydroxylase can
be at least 90% identical to the amino acid sequence of SEQ ID NO:12. The amino acid
sequence of the β-carotene C4 oxygenase can be at least 80% identical to the amino acid
sequence of SEQ ID NO:39. The membranous bacteria further can include an exogenous
nucleic acid encoding geranylgeranyl pyrophosphate synthase (e.g., a multifunctional
geranylgeranyl pyrophosphate synthase) or can lack endogenous bacteriochlorophyll
biosynthesis. The multifunctional geranylgeranyl pyrophosphate synthase can have an
amino acid sequence at least 90% identical to the amino acid sequence of SEQ ID NO:45.
The membranous bacteria further can include an exogenous nucleic acid encoding
phytoene synthase. The phytoene synthase can have an amino acid sequence at least 89%
identical to the amino acid sequence of SEQ ID NO:10.

In another aspect, the invention features membranous bacteria that include an
exogenous nucleic acid encoding a phytoene desaturase having an amino acid sequence at
least 90% identical to the amino acid sequence of SEQ ID NO:8, and wherein the
membranous bacteria produce detectable amounts of lycopene. The membranous
bacteria further can include a lycopene β-cyclase, wherein the membranous bacteria
produce detectable amounts of β-carotene. The membranous bacteria also can include a
β-carotene hydroxylase, wherein the membranous bacteria produce detectable amounts of
zeaxanthin.

In still yet another aspect, the invention features membranous bacteria that include
at least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase,
and β-carotene C4 oxygenase, wherein expression of the at least one exogenous nucleic
acid produces detectable amounts of canthaxanthin in the membranous bacteria. The
membranous bacteria also can include a β-carotene hydroxylase, wherein the
membranous bacteria produce detectable amounts of astaxanthin.
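The enzyme combinations recited in the aspects above can be read as a small substrate-to-product network. The following Python sketch is illustrative only. It encodes just the conversions stated in this summary; the names STEPS and reachable_products, and the assumption that the host already supplies the starting substrate, are hypothetical and are not part of the invention.

```python
# Conversions as stated in the summary: (enzyme, required substrates, product).
STEPS = [
    ("geranylgeranyl pyrophosphate synthase",
     frozenset({"farnesyl pyrophosphate", "isopentenyl pyrophosphate"}),
     "geranylgeranyl pyrophosphate"),
    ("phytoene synthase", frozenset({"geranylgeranyl pyrophosphate"}), "phytoene"),
    ("phytoene desaturase", frozenset({"phytoene"}), "lycopene"),
    ("lycopene beta-cyclase", frozenset({"lycopene"}), "beta-carotene"),
    ("beta-carotene hydroxylase", frozenset({"beta-carotene"}), "zeaxanthin"),
    ("beta-carotene hydroxylase", frozenset({"canthaxanthin"}), "astaxanthin"),
    ("beta-carotene C4 oxygenase", frozenset({"beta-carotene"}), "canthaxanthin"),
    ("beta-carotene C4 oxygenase", frozenset({"zeaxanthin"}), "astaxanthin"),
]


def reachable_products(enzymes, starting_pool):
    """Return every compound obtainable from starting_pool with the given enzymes."""
    pool = set(starting_pool)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, product in STEPS:
            # A step fires only if its enzyme is present and all substrates exist.
            if enzyme in enzymes and substrates <= pool and product not in pool:
                pool.add(product)
                changed = True
    return pool


# Assumption for illustration: the host cell already makes phytoene.
canthaxanthin_set = {
    "phytoene desaturase",
    "lycopene beta-cyclase",
    "beta-carotene C4 oxygenase",
}
without_hydroxylase = reachable_products(canthaxanthin_set, {"phytoene"})
with_hydroxylase = reachable_products(
    canthaxanthin_set | {"beta-carotene hydroxylase"}, {"phytoene"}
)
```

Under these assumptions, the three-enzyme set reaches canthaxanthin but not astaxanthin, and adding the β-carotene hydroxylase makes astaxanthin reachable, which mirrors the aspect described above.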

The invention also features a composition that includes an engineered
Rhodobacter cell, wherein the cell produces a detectable amount of astaxanthin or
canthaxanthin. The engineered Rhodobacter cell can include at least one exogenous
nucleic acid encoding phytoene desaturase, lycopene β-cyclase, β-carotene hydroxylase,
and β-carotene C4 oxygenase. The composition can be formulated for aquaculture and
can pigment the flesh of fish or the carapace of crustaceans after ingestion. The
composition can be formulated for human consumption or as an animal feed (e.g.,
formulated for consumption by chickens, turkeys, cattle, swine, or sheep).

The invention also features a method of making a nutraceutical. The method
includes extracting carotenoids from an engineered Rhodobacter cell, the engineered
Rhodobacter cell including at least one exogenous nucleic acid encoding phytoene
desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase,
and wherein the Rhodobacter cell produces detectable amounts of astaxanthin.

In yet another aspect, the invention features membranous bacteria, wherein the
membranous bacteria include an exogenous nucleic acid encoding a lycopene β-cyclase
having an amino acid sequence at least 83% identical to the amino acid sequence of SEQ
ID NO:4. The membranous bacteria further can include a phytoene desaturase (e.g., an
exogenous phytoene desaturase), wherein the membranous bacteria produce detectable
amounts of β-carotene. The membranous bacteria also can include a β-carotene
hydroxylase (e.g., an exogenous β-carotene hydroxylase), wherein the bacteria produce
detectable amounts of zeaxanthin.

Membranous bacteria that include a β-carotene hydroxylase having an amino acid
sequence at least 90% identical to the amino acid sequence of SEQ ID NO:12 also are
featured. The membranous bacteria further can include a lycopene β-cyclase (e.g., an
exogenous lycopene β-cyclase), wherein the membranous bacteria produce detectable
amounts of zeaxanthin. The membranous bacteria also can include a phytoene desaturase
(e.g., an exogenous phytoene desaturase), wherein the membranous bacteria produce
detectable amounts of β-carotene.

The invention also features membranous bacteria (e.g., a Rhodobacter species)
lacking an endogenous nucleic acid encoding a farnesyl pyrophosphate synthase, wherein
the bacteria produce detectable amounts of carotenoids. The membranous bacteria also
can include an exogenous nucleic acid encoding a multifunctional geranylgeranyl
pyrophosphate synthase.

In another aspect, the invention features an isolated nucleic acid having at least
70% sequence identity (e.g., at least 80% or 90%) to the nucleotide sequence of SEQ ID
NO:38, or to a fragment of the nucleic acid of SEQ ID NO:38 at least 15 contiguous
nucleotides in length. The nucleic acid can encode a β-carotene C4 oxygenase.
Canthaxanthin can be made by contacting β-carotene with a polypeptide encoded by such
nucleic acids or a polypeptide having an amino acid sequence at least 80% identical to the
amino acid sequence of SEQ ID NO:39. Astaxanthin can be made by contacting
zeaxanthin with a polypeptide encoded by such isolated nucleic acids or a polypeptide
having an amino acid sequence at least 80% identical to the amino acid sequence of SEQ
ID NO:39.

In another aspect, the invention features membranous bacteria that include an
exogenous nucleic acid encoding a β-carotene C4 oxygenase, where the β-carotene
C4 oxygenase has an amino acid sequence at least 80% identical to the amino acid sequence
of SEQ ID NO:39.

In yet another aspect, the invention features a host cell comprising an exogenous
nucleic acid, wherein the exogenous nucleic acid includes a nucleic acid sequence
encoding one or more polypeptides that catalyze the formation of (3S, 3'S) astaxanthin,
wherein the host cell produces CoQ-10 and (3S, 3'S) astaxanthin. A method of making
CoQ-10 and (3S, 3'S) astaxanthin at substantially the same time also is featured. The
method includes transforming a host cell with a nucleic acid, wherein the nucleic acid
includes a nucleic acid sequence that encodes one or more polypeptides, wherein the
polypeptides catalyze the formation of (3S, 3'S) astaxanthin; and culturing the host cell
under conditions that allow for the production of (3S, 3'S) astaxanthin and CoQ-10. The
method further can include transforming the host cell with at least one exogenous nucleic
acid, the exogenous nucleic acid encoding one or more polypeptides, wherein the
polypeptides catalyze the formation of CoQ-10.

The invention also features an isolated nucleic acid having a nucleotide sequence
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ
ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:38, and SEQ ID NO:44.

An isolated nucleic acid having at least 90% sequence identity to the nucleotide
sequence of SEQ ID NO:44, or to a fragment of the nucleic acid of SEQ ID NO:44 at
least 60 contiguous nucleotides in length, also is featured. Geranylgeranyl pyrophosphate can
be made by contacting isopentenyl pyrophosphate and dimethylallyl pyrophosphate with
a polypeptide encoded by such a nucleic acid.

Unless otherwise defined, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Although methods and materials similar or equivalent to those
described herein can be used to practice the invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In case of conflict, the
present specification, including definitions, will control. In addition, the materials,
methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the 
following detailed description, and from the claims. 

DESCRIPTION OF DRAWINGS 

FIG. 1 is a schematic diagram of the biosynthetic pathway for the production of
zeaxanthin and conversion to zeaxanthin diglucoside.

FIG. 2 is a schematic diagram of the P. stewartii carotenoid gene operon (6586 bp).

FIG. 3 is a chromatogram of astaxanthin production in P. stewartii::crtW (B.
aurantiaca).

DETAILED DESCRIPTION 

Nucleic Acid Molecules 

The invention features isolated nucleic acids that encode enzymes involved in
carotenoid biosynthesis. The nucleic acids of SEQ ID NOs:1, 3, 5, 7, 9, and 11 encode
zeaxanthin glucosyl transferase (crtX), lycopene β-cyclase (crtY), geranylgeranyl
pyrophosphate synthase (crtE), phytoene desaturase (crtI), phytoene synthase (crtB), and
β-carotene hydroxylase (crtZ), respectively. A nucleic acid of the invention can have at
least 76% sequence identity, e.g., 78%, 80%, 85%, 90%, 95%, or 99% sequence identity,
to the nucleic acid of SEQ ID NO:1, or to fragments of the nucleic acid of SEQ ID NO:1
that are at least about 33 nucleotides in length; at least 78% sequence identity, e.g., 80%,
85%, 90%, 95%, or 99% sequence identity, to the nucleotide sequence of SEQ ID NO:3,
or to fragments of the nucleic acid of SEQ ID NO:3 that are at least about 32 nucleotides
in length; at least 81% sequence identity, e.g., 82%, 85%, 90%, 95%, or 99% sequence
identity, to the nucleotide sequence of SEQ ID NO:5, or to fragments of the nucleic acid
of SEQ ID NO:5 that are at least about 60 nucleotides in length; at least 82% sequence
identity, e.g., 83%, 85%, 90%, 95%, or 99% sequence identity, to the nucleotide
sequences of SEQ ID NO:7 or SEQ ID NO:9, or to fragments of the nucleic acids of
SEQ ID NO:7 or SEQ ID NO:9 that are at least about 30 or 23 nucleotides in length,
respectively; or at least 85% sequence identity, e.g., 86%, 90%, 92%, 95%, or 99% sequence
identity, to the nucleotide sequence of SEQ ID NO:11, or to fragments of the nucleic acid
of SEQ ID NO:11 that are at least about 36 nucleotides in length. A nucleic acid of the
invention can have at least 60% sequence identity, e.g., at least 65%, 70%, 75%, 80%,
85%, 90%, 95%, or 99% sequence identity to the nucleotide sequence of SEQ ID NO:38
or to fragments of the nucleic acid of SEQ ID NO:38 that are at least about 15 nucleotides
in length. Such a nucleic acid can encode a β-carotene C4 oxygenase (crtW). A nucleic
acid of the invention also can have at least 90% identity to the nucleotide sequence set
forth in SEQ ID NO:44 or to fragments of the nucleic acid of SEQ ID NO:44 that are at
least about 60 nucleotides in length. Such a nucleic acid can encode a multifunctional
geranylgeranyl pyrophosphate synthase.
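For quick reference, the identity thresholds and fragment lengths recited in the paragraph above can be collected in a lookup table. The Python sketch below is a simplified reading offered for illustration only: the names NUCLEOTIDE_CRITERIA and meets_criteria are hypothetical, and the "at least about" lengths in the text are treated here as hard cutoffs, which is an assumption.

```python
# SEQ ID NO -> (minimum percent identity, minimum fragment length in nucleotides),
# transcribed from the paragraph above. Gene names are those given in the text.
NUCLEOTIDE_CRITERIA = {
    1: (76.0, 33),   # crtX, zeaxanthin glucosyl transferase
    3: (78.0, 32),   # crtY, lycopene beta-cyclase
    5: (81.0, 60),   # crtE, geranylgeranyl pyrophosphate synthase
    7: (82.0, 30),   # crtI, phytoene desaturase
    9: (82.0, 23),   # crtB, phytoene synthase
    11: (85.0, 36),  # crtZ, beta-carotene hydroxylase
    38: (60.0, 15),  # crtW, beta-carotene C4 oxygenase
    44: (90.0, 60),  # multifunctional geranylgeranyl pyrophosphate synthase
}


def meets_criteria(seq_id_no, percent_identity, aligned_length):
    """Check a measured identity and aligned length against the tabulated cutoffs.

    Simplification (assumption): both cutoffs are applied as strict minimums.
    """
    min_identity, min_length = NUCLEOTIDE_CRITERIA[seq_id_no]
    return percent_identity >= min_identity and aligned_length >= min_length
```

For example, under these assumptions meets_criteria(9, 95.0, 22) is False because the aligned length falls below the 23-nucleotide fragment length given for SEQ ID NO:9.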

Generally, percent sequence identity is calculated by determining the number of 
matched positions in aligned nucleic acid sequences, dividing the number of matched 
positions by the total number of aligned nucleotides, and multiplying by 100. A matched 
position refers to a position in which identical nucleotides occur at the same position in 
aligned nucleic acid sequences. Percent sequence identity can be determined for any 
nucleic acid or amino acid sequence as follows. First, a nucleic acid or amino acid 
sequence is compared to the identified nucleic acid or amino acid sequence using the 
BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLASTZ
containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This stand-alone
version of BLASTZ can be obtained from the University of Wisconsin library as well as
at www.fr.com or www.ncbi.nlm.nih.gov. Instructions explaining how to use the Bl2seq
program can be found in the readme file accompanying BLASTZ.

Bl2seq performs a comparison between two sequences using either the BLASTN
or BLASTP algorithm. BLASTN is used to compare nucleic acid sequences, while
BLASTP is used to compare amino acid sequences. To compare two nucleic acid
sequences, the options are set as follows: -i is set to a file containing the first nucleic acid
sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second
nucleic acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any
desired file name (e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options
are left at their default setting. For example, the following command can be used to
generate an output file containing a comparison between two sequences: C:\Bl2seq -i
c:\seq1.txt -j c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid
sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first
amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the
second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set
to any desired file name (e.g., C:\output.txt); and all other options are left at their default
setting. For example, the following command can be used to generate an output file
containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j
c:\seq2.txt -p blastp -o c:\output.txt. If the target sequence shares homology with any
portion of the identified sequence, then the designated output file will present those
regions of homology as aligned sequences. If the target sequence does not share
homology with any portion of the identified sequence, then the designated output file will
not present aligned sequences.
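The option settings described above can be assembled programmatically. The following Python sketch only builds the argument list for the two cases in the text and does not run the program; the function name bl2seq_args and the default executable name are assumptions made for illustration, while the options -i, -j, -p, -o, -q, and -r are those recited above.

```python
def bl2seq_args(first, second, program, output, executable="Bl2seq"):
    """Build the Bl2seq argument list using the option settings given in the text."""
    if program not in ("blastn", "blastp"):
        raise ValueError("program must be 'blastn' or 'blastp'")
    args = [executable, "-i", first, "-j", second, "-p", program, "-o", output]
    if program == "blastn":
        # Nucleic acid comparisons additionally set -q to -1 and -r to 2;
        # amino acid comparisons leave all other options at their defaults.
        args += ["-q", "-1", "-r", "2"]
    return args


nucleotide_cmd = bl2seq_args(r"C:\seq1.txt", r"C:\seq2.txt", "blastn", r"C:\output.txt")
protein_cmd = bl2seq_args(r"C:\seq1.txt", r"C:\seq2.txt", "blastp", r"C:\output.txt")
```

Keeping the arguments as a list, rather than a single string, avoids quoting problems with Windows-style paths if the list is later passed to a process launcher.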

Once aligned, a length is determined by counting the number of consecutive
nucleotides or amino acid residues from the target sequence presented in alignment with
sequence from the identified sequence starting with any matched position and ending with
any other matched position. A matched position is any position where an identical
nucleotide or amino acid residue is presented in both the target and identified sequence.
Gaps presented in the target sequence are not counted since gaps are not nucleotides or
amino acid residues. Likewise, gaps presented in the identified sequence are not counted
since target sequence nucleotides or amino acid residues are counted, not nucleotides or
amino acid residues from the identified sequence.

The percent identity over a particular length is determined by counting the number
of matched positions over that length and dividing that number by the length followed by
multiplying the resulting value by 100. For example, if (1) a 1000 nucleotide target
sequence is compared to the sequence set forth in SEQ ID NO:1, (2) the Bl2seq program
presents 200 nucleotides from the target sequence aligned with a region of the sequence
set forth in SEQ ID NO:1 where the first and last nucleotides of that 200 nucleotide
region are matches, and (3) the number of matches over those 200 aligned nucleotides is
180, then the 1000 nucleotide target sequence contains a length of 200 and a percent
identity over that length of 90 (i.e., 180 ÷ 200 × 100 = 90).
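The counting rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Bl2seq implementation: the function name identity_over_region, the use of "-" as the gap character, and the 0-based column indexing are all choices made here for the example.

```python
def identity_over_region(target, identified, start, end, gap="-"):
    """Return (length, matches, percent identity) for alignment columns start..end.

    target and identified are aligned strings of equal length; start and end are
    0-based, inclusive columns that must both be matched positions.
    """
    if len(target) != len(identified):
        raise ValueError("aligned sequences must have equal length")
    for column in (start, end):
        if target[column] == gap or target[column] != identified[column]:
            raise ValueError("region must start and end on matched positions")
    length = 0
    matches = 0
    for t, i in zip(target[start:end + 1], identified[start:end + 1]):
        if t == gap:
            continue  # gaps in the target are not residues, so they are not counted
        length += 1  # every target residue counts, even opposite a gap
        if t == i:
            matches += 1
    return length, matches, 100.0 * matches / length


# Hypothetical aligned sequences, for illustration only.
length, matches, percent = identity_over_region("ACGTTA", "AC-TCA", 0, 5)

# The worked example from the text: 180 matches over a length of 200.
worked_example = 100.0 * 180 / 200
```

In the hypothetical alignment, the target residue opposite the gap in the identified sequence still counts toward the length, so the length is 6 with 4 matches.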

It will be appreciated that a single nucleic acid or amino acid target sequence that
aligns with an identified sequence can have many different lengths with each length
having its own percent identity. For example, a target sequence containing a 20
nucleotide region that aligns with an identified sequence as follows has many different
lengths including those listed in Table 1.

                     1                  20
Target Sequence:     AGGTCGTGTACTGTCAGTCA (SEQ ID NO:46)
                     | || ||| |||| |||| |
Identified Sequence: ACGTGGTGAACTGCCAGTGA (SEQ ID NO:47)

TABLE 1

Starting    Ending                  Matched      Percent
Position    Position    Length      Positions    Identity

1           20          20          15           75.0
1           18          18          14           77.8
1           15          15          11           73.3
6           20          15          12           80.0
6           17          12          10           83.3
6           15          10          8            80.0
8           20          13          10           76.9
8           16          9           7            77.8

It is noted that the percent identity value is rounded to the nearest tenth. For example,
78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1, while 78.15, 78.16, 78.17, 78.18,
and 78.19 are rounded up to 78.2. It is also noted that the length value will always be an
integer.
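The entries of Table 1 can be recomputed directly from the two aligned sequences. The short Python sketch below is illustrative; the function name table_row is an assumption, and the decimal module with ROUND_HALF_UP is used here so that values such as 78.15 round up to 78.2 as stated above, which binary floating-point rounding does not guarantee.

```python
from decimal import Decimal, ROUND_HALF_UP

TARGET = "AGGTCGTGTACTGTCAGTCA"      # SEQ ID NO:46
IDENTIFIED = "ACGTGGTGAACTGCCAGTGA"  # SEQ ID NO:47


def table_row(start, end):
    """Return (length, matched positions, percent identity) for 1-based, inclusive positions."""
    length = end - start + 1
    pairs = zip(TARGET[start - 1:end], IDENTIFIED[start - 1:end])
    matched = sum(t == i for t, i in pairs)
    percent = (Decimal(100 * matched) / Decimal(length)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )
    return length, matched, percent


# The start and end positions listed in Table 1.
REGIONS = [(1, 20), (1, 18), (1, 15), (6, 20), (6, 17), (6, 15), (8, 20), (8, 16)]

for start, end in REGIONS:
    length, matched, percent = table_row(start, end)
    print(start, end, length, matched, percent)
```

Because this alignment contains no gaps, the length of each region is simply the number of positions it spans.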

Isolated nucleic acid molecules of the invention are at least about 20 nucleotides
in length. For example, the nucleic acid molecule can be about 20-30, 22-32, 33-50, 34 to
45, 40-50, 60-80, 62 to 92, 50-100, or greater than 150 nucleotides in length, e.g., 200-300,
300-500, or 500-1000 nucleotides in length. Such fragments, whether protein-encoding
or not, can be used as probes, primers, and diagnostic reagents. In some
embodiments, the isolated nucleic acid molecules encode a full-length zeaxanthin
glucosyl transferase, lycopene β-cyclase, geranylgeranyl pyrophosphate synthase,
phytoene desaturase, β-carotene hydroxylase, β-carotene C4 oxygenase, or
multifunctional geranylgeranyl pyrophosphate synthase polypeptide. Nucleic acid
molecules can be DNA or RNA, linear or circular, and in sense or antisense orientation.

Isolated nucleic acid molecules of the invention can be produced by standard 
techniques. As used herein, "isolated" refers to a sequence corresponding to part or all of 
a gene encoding a zeaxanthin glucosyl transferase, lycopene β-cyclase, geranylgeranyl
pyrophosphate synthase, phytoene desaturase, phytoene synthase, β-carotene
hydroxylase, β-carotene C4 oxygenase, or multifunctional geranylgeranyl pyrophosphate
synthase polypeptide, or an operon encoding two or more such polypeptides, but free of 
sequences that normally flank one or both sides of the wild-type gene or the operon in a 
naturally-occurring genome, e.g., a bacterial genome. The term "isolated" as used herein 
with respect to nucleic acids also includes any non-naturally-occurring nucleic acid 
sequence since such non-naturally-occurring sequences are not found in nature and do not 
have immediately contiguous sequences in a naturally-occurring genome. 

An isolated nucleic acid can be, for example, a DNA molecule, provided one of 
the nucleic acid sequences normally found immediately flanking that DNA molecule in a 
naturally-occurring genome is removed or absent. Thus, an isolated nucleic acid 
includes, without limitation, a DNA molecule that exists as a separate molecule (e.g., a 
cDNA or genomic DNA fragment produced by PCR or restriction endonuclease 
treatment) independent of other sequences as well as recombinant DNA that is 
incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a 
retrovirus, adenovirus, or herpes virus), or into the genomic DNA of a prokaryote or 
eukaryote. In addition, an isolated nucleic acid can include an engineered nucleic acid 
such as a recombinant DNA molecule that is part of a hybrid or fusion nucleic acid. A 
nucleic acid existing among hundreds to millions of other nucleic acids within, for 
example, cDNA libraries or genomic libraries, or gel slices containing a genomic DNA 
restriction digest, is not to be considered an isolated nucleic acid. 

Isolated nucleic acids within the scope of the invention can be obtained using any
method including, without limitation, common molecular cloning and chemical nucleic
acid synthesis techniques. For example, polymerase chain reaction (PCR) techniques can
be used to obtain an isolated nucleic acid containing a nucleic acid sequence sharing
identity with the sequences set forth in SEQ ID NOs:1, 3, 5, 7, 9, 11, 38, or 44. PCR
refers to a procedure or technique in which target nucleic acids are amplified. Sequence
information from the ends of the region of interest or beyond typically is employed to
design oligonucleotide primers that are identical in sequence to opposite strands of the
template to be amplified. PCR can be used to amplify specific sequences from DNA as
well as RNA, including sequences from total genomic DNA or total cellular RNA.
Primers are typically 14 to 40 nucleotides in length, but can range from 10 nucleotides to
hundreds of nucleotides in length. General PCR techniques are described, for example, in
PCR Primer: A Laboratory Manual, Ed. by Dieffenbach, C. and Dveksler, G., Cold
Spring Harbor Laboratory Press, 1995. When using RNA as a source of template, reverse
transcriptase can be used to synthesize complementary DNA (cDNA) strands.

Isolated nucleic acids of the invention also can be chemically synthesized, either
as a single nucleic acid molecule or as a series of oligonucleotides. For example, one or
more pairs of long oligonucleotides (e.g., >100 nucleotides) can be synthesized that
contain the desired sequence, with each pair containing a short segment (e.g., about 15
nucleotides) of complementary DNA such that a duplex is formed when the
oligonucleotide pair is annealed. DNA polymerase is used to extend the oligonucleotides,
resulting in a double-stranded nucleic acid molecule per oligonucleotide pair, which then
can be ligated into a vector.
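The oligonucleotide-pair approach can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the function names, the placement of the overlap near the middle of the desired sequence, and the short example sequence are hypothetical, and real designs would also consider melting temperature and secondary structure, which are not modeled here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligo_pair(desired, overlap=15):
    """Split a desired sequence into two oligonucleotides whose 3' ends are
    complementary over `overlap` nucleotides (both returned 5' to 3')."""
    if len(desired) <= overlap:
        raise ValueError("desired sequence must be longer than the overlap")
    split = (len(desired) - overlap) // 2
    forward = desired[:split + overlap]            # top strand, 5' portion
    reverse = reverse_complement(desired[split:])  # bottom strand, covers the rest
    return forward, reverse


def extend_pair(forward, reverse, overlap=15):
    """Model annealing at the 3' ends followed by polymerase extension and
    return the top strand of the resulting duplex."""
    if forward[-overlap:] != reverse_complement(reverse[-overlap:]):
        raise ValueError("3' ends of the pair are not complementary")
    return forward + reverse_complement(reverse)[overlap:]


# Hypothetical 60-nucleotide sequence, much shorter than the >100 nucleotide
# oligonucleotides mentioned in the text, to keep the example readable.
desired = "ATGGCTAGCTAGGATCCGATCGATTACGCGTACGATCGGCTAAGCTTGCATGCCTGCAGG"
forward, reverse = design_oligo_pair(desired)
rebuilt = extend_pair(forward, reverse)
```

In this model, extending the annealed pair regenerates the full desired sequence on the top strand, which is the double-stranded product that would then be ligated into a vector.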

Isolated nucleic acids of the invention also can be obtained by mutagenesis. For 
example, an isolated nucleic acid that shares identity with a sequence set forth in SEQ ID
NO:1, 3, 5, 7, 9, 11, 38, or 44 can be mutated using common molecular cloning
techniques (e.g., site-directed mutagenesis). Possible mutations include, without
limitation, deletions, insertions, and substitutions, as well as combinations of deletions,
insertions, and substitutions. Alignments of nucleic acids of the invention with other
known sequences encoding carotenoid enzymes can be used to identify positions to
modify. For example, alignment of the nucleotide sequence of SEQ ID NO:5 with other
nucleic acids encoding geranylgeranyl pyrophosphate synthases (e.g., from Erwinia
uredovora) provides guidance as to which nucleotides can be substituted, which
nucleotides can be deleted, and at which positions nucleotides can be inserted. 

In addition, nucleic acid and amino acid databases (e.g., GenBank®) can be used
to obtain an isolated nucleic acid within the scope of the invention. For example, any
nucleic acid sequence having homology to a sequence set forth in SEQ ID NO:1, 3, 5, 7,
9, 11, 38, or 44, or any amino acid sequence having homology to a sequence set forth in
SEQ ID NO:2, 4, 6, 8, 10, 12, 39, or 45 can be used as a query to search GenBank®.

Furthermore, nucleic acid hybridization techniques can be used to obtain an
isolated nucleic acid within the scope of the invention. Briefly, any nucleic acid having
some homology to a sequence set forth in SEQ ID NO:1, 3, 5, 7, 9, 11, 38, or 44 can be
used as a probe to identify a similar nucleic acid by hybridization under conditions of
moderate to high stringency. Moderately stringent hybridization conditions include
hybridization at about 42°C in a hybridization solution containing 25 mM KPO4 (pH 7.4),
5X SSC, 5X Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA,
50% formamide, 10% dextran sulfate, and 1-15 ng/mL probe (about 5×10^7 cpm/µg), and
wash steps at about 50°C with a wash solution containing 2X SSC and 0.1% SDS. For
high stringency, the same hybridization conditions can be used, but washes are performed
at about 65°C with a wash solution containing 0.2X SSC and 0.1% SDS.

Once a nucleic acid is identified, the nucleic acid then can be purified, sequenced,
and analyzed to determine whether it is within the scope of the invention as described
herein. Hybridization can be done by Southern or Northern analysis to identify a DNA or
RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with
biotin, digoxygenin, an enzyme, or a radioisotope such as 32P or 35S. The DNA or RNA
to be analyzed can be electrophoretically separated on an agarose or polyacrylamide gel,
transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the
probe using standard techniques well known in the art. See, for example, sections 7.39-7.52
of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor
Laboratory, Plainview, NY.

Polypeptides 

The present invention also features isolated zeaxanthin glucosyl transferase (SEQ 
ID NO:2), lycopene β-cyclase (SEQ ID NO:4), geranylgeranyl pyrophosphate synthase 
(SEQ ID NO:6), phytoene desaturase (SEQ ID NO:8), phytoene synthase (SEQ ID 
NO:10), and β-carotene hydroxylase (SEQ ID NO:12) polypeptides. In addition, the 
invention features isolated β-carotene C4 oxygenase polypeptides (SEQ ID NO:39) and 
multifunctional geranylgeranyl pyrophosphate synthase polypeptides (SEQ ID NO:45). 
A polypeptide of the invention can have at least 75% sequence identity, e.g., 80%, 85%, 
90%, 95%, or 99% sequence identity, to the amino acid sequence of SEQ ID NO:2 or to 
fragments thereof; at least 83% sequence identity, e.g., 85%, 90%, 95%, or 99% sequence 
identity, to the amino acid sequence of SEQ ID NO:4 or to fragments thereof; at least 
85% sequence identity, e.g., 90%, 95%, or 99% sequence identity, to the amino acid 
sequence of SEQ ID NO:6 or to fragments thereof; at least 90% sequence identity, e.g., 
90%, 92%, 95%, or 99% sequence identity, to the amino acid sequence of SEQ ID NO:8 
or to fragments thereof; at least 89% sequence identity, e.g., 90%, 95%, or 99% sequence 
identity, to the amino acid sequence of SEQ ID NO:10 or to fragments thereof; at least 
90% sequence identity, e.g., 95% or 99% sequence identity, to the amino acid sequence 
of SEQ ID NO:12 or to fragments thereof; at least 60% sequence identity, e.g., 65%, 
70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity, to the amino acid sequence 
of SEQ ID NO:39 or to fragments thereof; or at least 90% sequence identity, e.g., 95% or 
99% sequence identity, to the amino acid sequence set forth in SEQ ID NO:45 or to 
fragments thereof. Percent sequence identity can be determined as described above for 
nucleic acid molecules. 
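The minimum identity values recited above can be collected in one table and checked programmatically. The sketch below is illustrative only; `MIN_IDENTITY` and `meets_identity_threshold` are hypothetical names, and the percent identity is assumed to have been computed beforehand as described for nucleic acid molecules.

```python
# Minimum percent sequence identity for each polypeptide, keyed by SEQ ID NO,
# as recited in the paragraph above.
MIN_IDENTITY = {2: 75, 4: 83, 6: 85, 8: 90, 10: 89, 12: 90, 39: 60, 45: 90}


def meets_identity_threshold(seq_id_no: int, percent_identity: float) -> bool:
    """Return True if percent_identity is at or above the recited minimum."""
    if seq_id_no not in MIN_IDENTITY:
        raise KeyError(f"no threshold recorded for SEQ ID NO:{seq_id_no}")
    return percent_identity >= MIN_IDENTITY[seq_id_no]
```

For example, a candidate with 82.9% identity to SEQ ID NO:4 falls just below the 83% minimum, whereas the same value would satisfy the 60% minimum for SEQ ID NO:39.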

An "isolated polypeptide" has been separated from cellular components that 
naturally accompany it. Typically, the polypeptide is isolated when it is at least 60% 
(e.g., 70%, 80%, 90%, 95%, or 99%), by weight, free from proteins and naturally- 
occurring organic molecules that are naturally associated with it. In general, an isolated 
polypeptide will yield a single major band on a non-reducing polyacrylamide gel. 

The term "polypeptide" includes any chain of amino acids, regardless of length or 
post-translational modification. Polypeptides that have identity to the amino acid 
sequences of SEQ ID NO:2, 4, 6, 8, 10, 12, 39, or 45 can retain the function of the 
enzyme (see FIG. 1 for a schematic of the carotenoid biosynthesis pathway). For 
example, geranylgeranyl pyrophosphate synthase can produce geranylgeranyl 
pyrophosphate (GGPP) by condensing isopentenyl pyrophosphate (IPP) with 
farnesyl pyrophosphate (FPP). Phytoene synthase can produce phytoene by condensing 
two molecules of GGPP. Phytoene desaturase can perform four successive 
desaturations on phytoene to form lycopene. Lycopene β-cyclase can perform two 
successive cyclization reactions on lycopene to form β-carotene. β-carotene hydroxylase 
can perform two successive hydroxylation reactions on β-carotene to form zeaxanthin. 
Alternatively, β-carotene hydroxylase can perform two successive hydroxylation 
reactions on canthaxanthin to form astaxanthin. Zeaxanthin glucosyl transferase can add 
one or two glucose or other sugar moieties to zeaxanthin to form zeaxanthin 
monoglycoside or diglycoside, respectively. β-carotene C4 oxygenase can convert the 
methylene groups at the C4 and C4' positions of β-carotene or zeaxanthin to ketone 
groups, forming canthaxanthin or astaxanthin, respectively. Multifunctional geranylgeranyl 
pyrophosphate synthase can directly convert 3 IPP molecules and 1 dimethylallyl 
pyrophosphate (DMAPP) molecule to 1 GGPP molecule. 
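The enzyme activities listed above can be written as a table of substrate-to-product steps, from which the compounds obtainable with a given set of enzymes follow by repeated application. The sketch below is a simplification for illustration: it ignores stoichiometry and intermediates, treats glycosylation as two sequential steps (an assumption), and the names `REACTIONS` and `reachable` are hypothetical.

```python
# (enzyme, substrates required, product formed), restating the text above.
REACTIONS = [
    ("geranylgeranyl pyrophosphate synthase", {"IPP", "FPP"}, "GGPP"),
    ("multifunctional geranylgeranyl pyrophosphate synthase", {"IPP", "DMAPP"}, "GGPP"),
    ("phytoene synthase", {"GGPP"}, "phytoene"),
    ("phytoene desaturase", {"phytoene"}, "lycopene"),
    ("lycopene beta-cyclase", {"lycopene"}, "beta-carotene"),
    ("beta-carotene hydroxylase", {"beta-carotene"}, "zeaxanthin"),
    ("beta-carotene hydroxylase", {"canthaxanthin"}, "astaxanthin"),
    ("beta-carotene C4 oxygenase", {"beta-carotene"}, "canthaxanthin"),
    ("beta-carotene C4 oxygenase", {"zeaxanthin"}, "astaxanthin"),
    ("zeaxanthin glucosyl transferase", {"zeaxanthin"}, "zeaxanthin monoglycoside"),
    ("zeaxanthin glucosyl transferase", {"zeaxanthin monoglycoside"}, "zeaxanthin diglycoside"),
]


def reachable(metabolites, enzymes):
    """Return every compound obtainable from the starting metabolites
    using only the listed enzymes (qualitative; ignores yields)."""
    pool = set(metabolites)
    changed = True
    while changed:
        changed = False
        for enzyme, substrates, product in REACTIONS:
            if enzyme in enzymes and substrates <= pool and product not in pool:
                pool.add(product)
                changed = True
    return pool
```

For example, starting from phytoene with phytoene desaturase, lycopene β-cyclase, and β-carotene C4 oxygenase, the sketch reaches canthaxanthin but not astaxanthin, because no hydroxylase is present.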

In general, conservative amino acid substitutions, i.e., substitutions of similar 
amino acids, are tolerated without affecting protein function. Similar amino acids are 
those that are similar in size and/or charge properties. Families of amino acids with 
similar side chains are known. These families include amino acids with basic side chains 
(e.g., lysine, arginine, or histidine), acidic side chains (e.g., aspartic acid or glutamic 
acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, 
tyrosine, or cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, 
proline, phenylalanine, methionine, or tryptophan), β-branched side chains (e.g., 
threonine, valine, or isoleucine), and aromatic side chains (e.g., tyrosine, phenylalanine, 
tryptophan, or histidine). 
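The side-chain families above lend themselves to a simple membership test. The sketch below treats a substitution as conservative when both residues share at least one listed family; that rule, and the names `FAMILIES`, `shared_families`, and `is_conservative`, are assumptions made for illustration, using standard one-letter amino acid codes.

```python
# Side-chain families restated from the text, using one-letter codes.
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a: str, b: str) -> list:
    """Return the sorted names of families containing both residues."""
    a, b = a.upper(), b.upper()
    return sorted(name for name, members in FAMILIES.items()
                  if a in members and b in members)


def is_conservative(a: str, b: str) -> bool:
    """True if the two residues differ but share at least one family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))
```

Under this rule, lysine to arginine is conservative (both basic), while aspartic acid to lysine is not.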

Mutagenesis also can be used to alter a nucleic acid such that activity of the 
polypeptide encoded by the nucleic acid is altered (e.g., to increase production of a 
particular carotenoid). For example, error-prone PCR (e.g., GeneMorph PCR 
Mutagenesis Kit; Stratagene Inc., La Jolla, CA; Catalog #600550; Revision #090001) can 
be used to mutagenize the B. aurantiaca crtW gene (SEQ ID NO:38) to increase the 
amount of di-keto carotenoid (e.g., astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'-dione) 
or canthaxanthin (β,β-carotene-4,4'-dione)) relative to mono-keto carotenoid (e.g., 
echinenone (β,β-carotene-4-one) or adonixanthin (3,3'-dihydroxy-β,β-carotene-4-one)) that 
is produced. In general, the nucleic acid to be mutagenized can be cloned into a vector 
such as pCR-Blunt II-TOPO (Clontech; Palo Alto, CA) and used as a template for 
error-prone PCR. For purposes of directed evolution, mutation frequencies of 2-7 nucleotides / 
kbp template (1-4 amino acid mutations / 333 amino acids) generally are desired. 
Mutation frequency can be lowered or raised by increasing or decreasing the template 
concentration, respectively. PCR can be performed according to the manufacturer's 
recommendations. Mutagenized nucleic acid is ligated into an expression vector, which is 
used to transform a host, and activity of the expressed protein is assessed. For example, 
in the case of the crtW gene, electrocompetent P. stewartii (ATCC 8200) cells can be 
prepared and transformed as described herein, and resulting individual colonies can be 
screened by visual inspection for a phenotypic change from bright yellow pigmentation 
(production of zeaxanthin) to yellow-orange (production of mono-keto carotenoid) or 
reddish-orange (production of di-keto carotenoid). Production of increased amounts of 
astaxanthin can be confirmed by HPLC/MS. 
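The mutation frequency stated above (2-7 nucleotides per kbp of template, where 1 kbp corresponds to about 333 codons) scales linearly with template length. The following sketch works through that arithmetic; the function names are hypothetical, and the 750 bp length used in the note below is an arbitrary illustration, not the length of any sequence disclosed herein.

```python
def expected_nucleotide_changes(template_bp, per_kbp_low=2.0, per_kbp_high=7.0):
    """Scale the 2-7 changes/kbp range to a template of the given length."""
    scale = template_bp / 1000.0
    return per_kbp_low * scale, per_kbp_high * scale


def codon_count(template_bp):
    """Number of complete codons in a coding template of the given length."""
    return template_bp // 3
```

For a 1000 bp template the range is 2 to 7 changes over 333 codons, matching the figures in the text; for a hypothetical 750 bp template it would be 1.5 to 5.25 changes over 250 codons.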

Isolated polypeptides of the invention can be obtained, for example, by extraction 
from a natural source (e.g., a plant or bacterial cell), chemical synthesis, or by 
recombinant production in a host. For example, a polypeptide of the invention can be 
produced by ligating a nucleic acid molecule encoding the polypeptide into a nucleic acid 
construct such as an expression vector, and transforming a bacterial or eukaryotic host 
cell with the expression vector. In general, nucleic acid constructs include expression 
control elements operably linked to a nucleic acid sequence encoding a polypeptide of the 
invention (e.g., zeaxanthin glucosyl transferase, lycopene β-cyclase, geranylgeranyl 
pyrophosphate synthase, phytoene desaturase, phytoene synthase, β-carotene 
hydroxylase, β-carotene C4 oxygenase, or multifunctional geranylgeranyl pyrophosphate 
synthase polypeptides). Expression control elements do not typically encode a gene 
product, but instead affect the expression of the nucleic acid sequence. As used herein, 
"operably linked" refers to connection of the expression control elements to the nucleic 
acid sequence in such a way as to permit expression of the nucleic acid sequence. 

Expression control elements can include, for example, promoter sequences, enhancer 
sequences, response elements, polyadenylation sites, or inducible elements. Non-limiting 
examples of promoters include the puf promoter from Rhodobacter sphaeroides 
(GenBank Accession No. E13945), the nifHDK promoter from R. sphaeroides (GenBank 
Accession No. AF031817), and the fliK promoter from R. sphaeroides (GenBank 
Accession No. U86454). 

In bacterial systems, a strain of E. coli such as DH10B or BL-21 can be used. 
Suitable E. coli vectors include, but are not limited to, pUC18, pUC19, the pGEX series 
of vectors that produce fusion proteins with glutathione S-transferase (GST), and 
pBluescript series of vectors. Transformed E. coli are typically grown exponentially, then 
stimulated with isopropylthiogalactopyranoside (IPTG) prior to harvesting. In general, 
fusion proteins produced from the pGEX series of vectors are soluble and can be purified 
easily from lysed cells by adsorption to glutathione-agarose beads followed by elution in 
the presence of free glutathione. The pGEX vectors are designed to include thrombin or 
factor Xa protease cleavage sites such that the cloned target gene product can be released 
from the GST moiety. 

In eukaryotic host cells, a number of viral-based expression systems can be 
utilized to express polypeptides of the invention. A nucleic acid encoding a polypeptide 
of the invention can be cloned into, for example, a baculoviral vector such as pBlueBac 
(Invitrogen, San Diego, CA) and then used to co-transfect insect cells such as Spodoptera 
frugiperda (Sf9) cells with wild-type DNA from Autographa californica multiply 
enveloped nuclear polyhedrosis virus (AcMNPV). Recombinant viruses producing 
polypeptides of the invention can be identified by standard methodology. Alternatively, a 
nucleic acid encoding a polypeptide of the invention can be introduced into an SV40, 
retroviral, or vaccinia based viral vector and used to infect suitable host cells. 

A polypeptide within the scope of the invention can be "engineered" to contain an 
amino acid sequence that allows the polypeptide to be captured onto an affinity matrix. 
For example, a tag such as c-myc, hemagglutinin, polyhistidine, or Flag™ tag (Kodak) 
can be used to aid polypeptide purification. Such tags can be inserted anywhere within 
the polypeptide, including at either the carboxyl or amino terminus. Other fusions that 
could be useful include enzymes that aid in the detection of the polypeptide, such as 
alkaline phosphatase. 

Agrobacterium-mediated transformation, electroporation, and particle gun 
transformation can be used to transform plant cells. Illustrative examples of 
transformation techniques are described in U.S. Patent No. 5,204,253 (particle gun) and 
U.S. Patent No. 5,188,958 (Agrobacterium). Transformation methods utilizing the Ti and 
Ri plasmids of Agrobacterium spp. typically use binary type vectors. See, Walkerpeach, C. et 
al., in Plant Molecular Biology Manual, S. Gelvin and R. Schilperoort, eds., Kluwer 
Dordrecht, C1:1-19 (1994). If cell or tissue cultures are used as the recipient tissue for 
transformation, plants can be regenerated from transformed cultures by techniques known 
to those skilled in the art. 

Engineered cells 

Any cell containing an isolated nucleic acid within the scope of the invention is 
itself within the scope of the invention. This includes, without limitation, prokaryotic 
cells such as R. sphaeroides cells and eukaryotic cells such as plant, yeast, and other 
fungal cells. It is noted that cells containing an isolated nucleic acid of the invention are 
not required to express the isolated nucleic acid. In addition, the isolated nucleic acid can 
be integrated into the genome of the cell or maintained in an episomal state. In other 
words, cells can be stably or transiently transfected with an isolated nucleic acid of the 
invention. 

Any method can be used to introduce an isolated nucleic acid into a cell. In fact, 
many methods for introducing nucleic acid into a cell, whether in vivo or in vitro, are well 
known to those skilled in the art. For example, calcium phosphate precipitation, 
conjugation, electroporation, heat shock, lipofection, microinjection, and viral-mediated 
nucleic acid transfer are common methods that can be used to introduce nucleic acid 
molecules into a cell. In addition, naked DNA can be delivered directly to cells in vivo as 
described elsewhere (U.S. Patent Nos. 5,580,859 and 5,589,466). Furthermore, nucleic 
acid can be introduced into cells by generating transgenic animals. 

Any method can be used to identify cells that contain an isolated nucleic acid 
within the scope of the invention. For example, PCR and nucleic acid hybridization 
techniques such as Northern and Southern analysis can be used. In some cases, 
immunohistochemistry and biochemical techniques can be used to determine if a cell 
contains a particular nucleic acid by detecting the expression of a polypeptide encoded by 
that particular nucleic acid. For example, the polypeptide of interest can be detected with 
an antibody having specific binding affinity for that polypeptide, which indicates that the 
cell not only contains the introduced nucleic acid but also expresses the encoded 
polypeptide. Enzymatic activities of the polypeptide of interest also can be detected, or an 
end product (e.g., a particular carotenoid) can be detected, as an indication that the cell 
contains the introduced nucleic acid and expresses the encoded polypeptide from that 
introduced nucleic acid. 

The cells described herein can contain a single copy, or multiple copies (e.g., 
about 5, 10, 20, 35, 50, 75, 100 or 150 copies), of a particular exogenous nucleic acid. 
All non-naturally-occurring nucleic acids are considered exogenous nucleic acids once 
introduced into the cell. The term "exogenous" as used herein with reference to a nucleic 
acid and a particular cell refers to any nucleic acid that does not originate from that 
particular cell as found in nature. Nucleic acid that is naturally-occurring also can be 
exogenous to a particular cell. For example, an entire operon that is isolated from a 
bacterium is an exogenous nucleic acid with respect to a second bacterium once that operon 
is introduced into the second bacterium. For example, a bacterial cell (e.g., Rhodobacter) can 
contain about 50 copies of an exogenous nucleic acid of the invention. In addition, the 
cells described herein can contain more than one particular exogenous nucleic acid. For 
example, a bacterial cell can contain about 50 copies of exogenous nucleic acid X as well 
as about 75 copies of exogenous nucleic acid Y. In these cases, each different nucleic 
acid can encode a different polypeptide having its own unique enzymatic activity. For 
example, a bacterial cell can contain two different exogenous nucleic acids such that a 
high level of astaxanthin or other carotenoid is produced. In addition, a single exogenous 
nucleic acid can encode one or more polypeptides. For example, a single nucleic acid can 
contain sequences that encode three or more different polypeptides. 

Microorganisms that are suitable for producing carotenoids may or may not 
naturally produce carotenoids, and include prokaryotic and eukaryotic microorganisms, 
such as bacteria, yeast, and fungi. In particular, yeast such as Phaffia rhodozyma 
(Xanthophyllomyces dendrorhous), Candida utilis, and Saccharomyces cerevisiae; fungi 
such as Neurospora crassa, Phycomyces blakesleeanus, Blakeslea trispora, and 
Aspergillus sp.; Archaebacteria such as Halobacterium salinarium; and Eubacteria 
including Pantoea species (formerly called Erwinia) such as Pantoea stewartii (e.g., 
ATCC Accession #8200), flavobacteria species such as Xanthobacter autotrophicus and 
Flavobacterium multivorum, Zymomonas mobilis, Rhodobacter species such as R. 
sphaeroides and R. capsulatus, E. coli, and E. vulneris can be used. Other examples of 
bacteria that may be used include bacteria in the genus Sphingomonas and Gram negative 
bacteria in the α-subdivision, including, for example, Paracoccus, Azotobacter, 
Agrobacterium, and Erythrobacter. Eubacteria, and especially R. sphaeroides and R. 
capsulatus, are particularly useful. R. sphaeroides and R. capsulatus naturally produce 
certain carotenoids and grow on defined media. Such Rhodobacter species also are 
non-pyrogenic, minimizing health concerns about use in nutritional supplements. In some 
embodiments, it can be useful to produce carotenoids in plants and algae such as Zea 
mays, Brassica napus, Lycopersicon esculentum, Tagetes erecta, Haematococcus 
pluvialis, Dunaliella salina, Chlorella protothecoides, and Neospongiococcum 
excentrum. 

It is noted that bacteria can be membranous or non-membranous bacteria. The 
term "membranous bacteria" as used herein refers to any naturally-occurring, genetically 
modified, or environmentally modified bacteria having an intracytoplasmic membrane. 
An intracytoplasmic membrane can be organized in a variety of ways including, without 
limitation, vesicles, tubules, thylakoid-like membrane sacs, and highly organized 
membrane stacks. Any method can be used to analyze bacteria for the presence of 
intracytoplasmic membranes including, without limitation, electron microscopy, light 
microscopy, and density gradients. See, e.g., Chory et al., (1984) J. Bacteriol., 
159:540-554; Niederman and Gibson, Isolation and Physiochemical Properties of Membranes 
from Purple Photosynthetic Bacteria. In: The Photosynthetic Bacteria, Ed. by Roderick 
K. Clayton and William R. Sistrom, Plenum Press, pp. 79-118 (1978); and Lueking et al., 
(1978) J. Biol. Chem., 253:451-457. Examples of membranous bacteria that can be used 
include, without limitation, Purple Non-Sulfur Bacteria, including bacteria of the 
Rhodospirillaceae family such as those in the genus Rhodobacter (e.g., R. sphaeroides 
and R. capsulatus), the genus Rhodospirillum, the genus Rhodopseudomonas, the genus 
Rhodomicrobium, and the genus Rhodophila. The term "non-membranous bacteria" 
refers to any bacteria lacking intracytoplasmic membrane. Membranous bacteria can be 
highly membranous bacteria. The term "highly membranous bacteria" as used herein 
refers to any bacterium having more intracytoplasmic membrane than R. sphaeroides 
(ATCC 17023) cells have after the R. sphaeroides (ATCC 17023) cells have been 
(1) cultured chemoheterotrophically under aerobic conditions for four days, (2) cultured 
chemoheterotrophically under anaerobic conditions for four hours, and (3) harvested. Aerobic 
culture conditions include culturing the cells in the dark at 30°C in the presence of 25% 
oxygen. Anaerobic culture conditions include culturing the cells in the light at 30°C in 
the presence of 2% oxygen. After the four hour anaerobic culturing step, the R. 
sphaeroides (ATCC 17023) cells are harvested by centrifugation and analyzed. 

Nucleic acids of the invention can be expressed in microorganisms so that 
detectable amounts of carotenoids are produced. As used herein, "detectable" refers to 
the ability to detect the carotenoid and any esters or glycosides thereof using standard 
analytical methodology. In general, carotenoids can be extracted with an organic solvent 
such as acetone or methanol and detected by an absorption scan from 400-500 nm in the 
same organic solvent. In some cases, it is desirable to back-extract with a second organic 
solvent, such as hexane. The maximal absorbance of each carotenoid depends on the 
solvent that it is in. For example, in acetone, the maximal absorbance of lutein is at 451 
nm, while maximal absorbance of zeaxanthin is at 454 nm. In hexane, the maximal 
absorbance of lutein and zeaxanthin is 446 nm and 450 nm, respectively. High 
performance liquid chromatography coupled to mass spectrometry also can be used to 
detect carotenoids. Two reverse phase columns that are connected in series can be used 
with a solvent gradient of water and acetone. The first column can be a C30 specialty 
column designed for carotenoid separation (e.g., YMC Carotenoid S3µ; 2.0 x 150 mm, 
3 µm particle size; Waters Corporation, PN CT99S031502WT) followed by a C8 Xterra 
MS column (e.g., Xterra MS C8; 2.1 x 250 mm, 5 µm particle size; Waters Corporation, 
PN 186000459). 
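Because the absorbance maxima given above depend on both carotenoid and solvent, an observed peak can be compared against them as a first screen. The sketch below is illustrative only; `LAMBDA_MAX_NM` and `candidates` are hypothetical names, and the 2 nm matching tolerance is an assumption, not a value taken from this description.

```python
# Absorbance maxima (nm) restated from the text, keyed by (carotenoid, solvent).
LAMBDA_MAX_NM = {
    ("lutein", "acetone"): 451,
    ("zeaxanthin", "acetone"): 454,
    ("lutein", "hexane"): 446,
    ("zeaxanthin", "hexane"): 450,
}


def candidates(observed_nm, solvent, tolerance_nm=2.0):
    """List carotenoids whose tabulated maximum in the given solvent lies
    within tolerance_nm of the observed peak (tolerance is an assumption)."""
    return sorted(name for (name, solv), peak in LAMBDA_MAX_NM.items()
                  if solv == solvent and abs(peak - observed_nm) <= tolerance_nm)
```

A peak at 450 nm in hexane matches only zeaxanthin under this tolerance, whereas a peak at 452.5 nm in acetone matches both lutein and zeaxanthin, which illustrates why chromatographic separation is useful for confirmation.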

Detectable amounts of carotenoids include 10 µg/g dry cell weight (dcw) and 
greater, for example, about 10 to 100,000 µg/g dcw, about 100 to 60,000 µg/g dcw, about 
500 to 30,000 µg/g dcw, about 1000 to 20,000 µg/g dcw, about 5,000 to 55,000 µg/g dcw, 
or about 30,000 µg/g dcw to about 55,000 µg/g dcw. With respect to algae or other plants 
or organisms that produce a particular carotenoid, such as astaxanthin, β-carotene, 
lycopene, or zeaxanthin, a "detectable amount" of carotenoid is an amount that is detectable 
over the endogenous level in the plant or organism. 
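The definition above can be expressed as a short calculation: the measured carotenoid mass is normalized to dry cell weight and compared with the 10 µg/g dcw minimum and, where applicable, with the endogenous level. The sketch below is illustrative; the function names are hypothetical, and applying both criteria together is an assumption about how the two sentences of the definition combine.

```python
DETECTABLE_MIN_UG_PER_G = 10.0  # from the text: 10 ug/g dry cell weight and greater


def ug_per_g_dcw(carotenoid_ug, dry_cell_weight_g):
    """Normalize a carotenoid mass (ug) to dry cell weight (g)."""
    if dry_cell_weight_g <= 0:
        raise ValueError("dry cell weight must be positive")
    return carotenoid_ug / dry_cell_weight_g


def is_detectable(carotenoid_ug, dry_cell_weight_g, endogenous_ug_per_g=0.0):
    """True if the level is at least 10 ug/g dcw and exceeds the endogenous
    level (combining both criteria is an assumption of this sketch)."""
    level = ug_per_g_dcw(carotenoid_ug, dry_cell_weight_g)
    return level >= DETECTABLE_MIN_UG_PER_G and level > endogenous_ug_per_g
```

For instance, 50 µg of carotenoid recovered from 2 g of dry cells corresponds to 25 µg/g dcw, which is detectable in a non-producing host but not in a host whose endogenous level is 30 µg/g dcw.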

Depending on the microorganism and the metabolites present within the 
microorganism, one or more of the following enzymes may be expressed in the 
microorganism: geranylgeranyl pyrophosphate synthase, phytoene synthase, phytoene 
desaturase, lycopene β-cyclase, lycopene ε-cyclase, zeaxanthin glycosyl transferase, 
β-carotene hydroxylase, β-carotene C-4 ketolase, and multifunctional geranylgeranyl 
pyrophosphate synthase. Suitable nucleic acids encoding these enzymes are described 
above. Also, see, for example, GenBank Accession No. Y15112 for the sequence of 
carotenoid biosynthesis genes of Paracoccus marcusii; GenBank Accession No. D58420 
for the carotenoid biosynthesis genes of Agrobacterium aurantiacum; GenBank Accession 
Nos. M87280 and M99707 for the sequence of carotenoid biosynthesis genes of Erwinia 
herbicola; and GenBank Accession No. U62808 for carotenoid biosynthesis genes of 
Flavobacterium sp. strain R1534. 

For example, to produce lycopene in a microorganism that naturally produces 
neurosporene, such as Rhodobacter, an exogenous nucleic acid encoding phytoene 
desaturase can be expressed, e.g., a phytoene desaturase of the invention, and lycopene 
can be detected using standard methodology. Expression of additional carotenoid genes 
in such an engineered cell will allow for production of additional carotenoids. For 
example, expression of a lycopene β-cyclase in such an engineered cell allows production 
of detectable amounts of β-carotene, while further expression of a β-carotene hydroxylase 
allows production of another carotenoid, zeaxanthin. β-carotene and zeaxanthin can be 
detected using standard methodology and are distinguished by mobility on an HPLC 
column. Zeaxanthin diglucoside can be produced by further expression of zeaxanthin 
glucosyl transferase (crtX) in an organism that produces zeaxanthin. 

Alternatively, canthaxanthin can be produced in organisms that produce phytoene 
by expression of phytoene desaturase, lycopene β-cyclase, and β-carotene C4 oxygenase, 
an enzyme that converts the methylene groups at the C4 and C4' positions of the 
carotenoid to ketone groups. The β-carotene C4 oxygenase from, e.g., Agrobacterium 
aurantiacum or Haematococcus pluvialis can be used. See, GenBank Accession Nos. 
1136630 and X86782 for a description of the nucleotide and amino acid sequences of the 
A. aurantiacum and H. pluvialis enzymes, respectively. The β-carotene C4 oxygenase 
from Brevundimonas aurantiaca also can be used. See, Example 2 for a description of 
the nucleotide and amino acid sequences. In organisms that do not naturally produce 
carotenoids, additional enzymes are required for production of canthaxanthin. 
Geranylgeranyl pyrophosphate synthase and phytoene synthase can be expressed such 
that the necessary precursors for canthaxanthin synthesis are present. 

Astaxanthin also can be produced in microorganisms that naturally produce 
carotenoids. For example, a Rhodobacter cell can be engineered such that phytoene 
desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase are 
expressed and detectable amounts of astaxanthin are produced. Such an organism also 
can express an enzyme that can modify the 3 or 3' hydroxyl groups of astaxanthin with 
chemical groups such as glucose (e.g., to produce astaxanthin diglucoside), other sugars, 
or fatty acids. In addition, a P. stewartii cell can be engineered such that β-carotene C4 
oxygenase is expressed and detectable amounts of astaxanthin are produced. Astaxanthin 
can be detected as described above, and has maximal absorbance at 480 nm in acetone. 

Yields of astaxanthin and other carotenoids can be increased by expression of a 
multifunctional geranylgeranyl pyrophosphate synthase, such as that from S. shibatae 
(SEQ ID NO:45) or an Archaebacterial gene from Archaeoglobus fulgidus (GenBank 
Accession No. AF120272), in the engineered microorganism. The archaebacterial GGPPS 
gene is a homolog of the endogenous Rhodobacter gene and encodes an enzyme that 
directly converts 3 IPP molecules and 1 DMAPP molecule to 1 GGPP molecule, thereby 
reducing branching of the carotenoid pathway and eliminating production of other less 
desirable isoprenoids. Further reductions in less desirable metabolites can be obtained by 
eliminating endogenous bacteriochlorophyll biosynthesis, which redirects flow into 
carotenoid biosynthesis. For example, the bchO, bchD, and bchI genes can be deleted 
and/or replaced with an Archaebacterial GGPPS gene. Additional increases in yield can 
be obtained by deletion of the endogenous crtE gene or the endogenous crtC, crtD, crtE, 
crtA, crtI, and crtF genes. 
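The stoichiometry described herein (3 IPP and 1 DMAPP per GGPP for the multifunctional synthase, and two GGPP per phytoene) fixes the precursor demand per molecule of phytoene. The following sketch works through that arithmetic; the constant and function names are hypothetical.

```python
IPP_PER_GGPP = 3        # multifunctional GGPP synthase, per the text
DMAPP_PER_GGPP = 1
GGPP_PER_PHYTOENE = 2   # phytoene synthase condenses two GGPP


def precursors_for_phytoene(n_phytoene):
    """Return the GGPP, IPP, and DMAPP molecules consumed to make
    n_phytoene molecules via the multifunctional synthase route."""
    ggpp = GGPP_PER_PHYTOENE * n_phytoene
    return {
        "GGPP": ggpp,
        "IPP": IPP_PER_GGPP * ggpp,
        "DMAPP": DMAPP_PER_GGPP * ggpp,
    }
```

Each phytoene molecule therefore draws on 6 IPP and 2 DMAPP molecules by way of 2 GGPP molecules.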

Common mutagenesis or knock-out technology can be used to delete endogenous 
genes. Alternatively, antisense technology can be used to reduce enzymatic activity. For 
example, an R. sphaeroides cell can be engineered to contain a cDNA that encodes an 
antisense molecule that prevents an enzyme from being made. The term "antisense 
molecule" as used herein encompasses any nucleic acid that contains sequences that 
correspond to the coding strand of an endogenous polypeptide. An antisense molecule 
also can have flanking sequences (e.g., regulatory sequences). Thus, antisense molecules 
can be ribozymes or antisense oligonucleotides. A ribozyme can have any general 
structure including, without limitation, hairpin, hammerhead, or axhead structures, 
provided the molecule cleaves RNA. 

Control of the Ratio of Carotenoids 

The ratio of particular carotenoids, such as astaxanthin to canthaxanthin, or 
astaxanthin to zeaxanthin, can be controlled by expression of carotenoid genes from an 
inducible promoter or by use of constitutive promoters of different strengths. As used 
herein, "inducible" refers to both up-regulation and down-regulation. An inducible 
promoter is a promoter that is capable of directly or indirectly activating transcription of 
one or more DNA sequences or genes in response to an inducer. In the absence of an 
inducer, the DNA sequences or genes will not be transcribed. The inducer can be a 
chemical agent such as a protein, metabolite, growth regulator, phenolic compound, or a 
physiological stress imposed directly by heat, cold, salt, or toxic elements, or indirectly 
through the action of a pathogen or disease agent such as a virus. The inducer also can be 
an illumination agent such as light, darkness and light's various aspects, which include 
wavelength, intensity, fluorescence, direction, and duration. Examples of inducible 
promoters include the lac system and the tetracycline resistance system from E. coli. In 
one version of the lac system, expression of lac operator-linked sequences is 
constitutively activated by a lacR-VP16 fusion protein and is turned off in the presence of 
IPTG. In another version of the lac system, a lacR-VP16 variant is used that binds to lac 
operators in the presence of IPTG, which can be enhanced by increasing the temperature 
of the cells. 

Components of the tetracycline (Tc) resistance system also can be used to regulate 
gene expression. For example, the Tet repressor (TetR), which binds to tet operator 
sequences in the absence of tetracycline and represses gene transcription, can be used to 
repress transcription from a promoter containing tet operator sequences. TetR also can be 
fused to the activation domain of VP16 to create a tetracycline-controlled transcriptional 
activator (tTA), which is regulated by tetracycline in the same manner as TetR, i.e., tTA 
binds to tet operator sequences in the absence of tetracycline but not in the presence of 
tetracycline. Thus, in this system, in the continuous presence of Tc, gene expression is 
repressed, and to induce transcription, Tc is removed. 
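The regulatory logic of the two Tc-based arrangements described above can be summarized as a truth table. The sketch below is a qualitative model for illustration only; `operator_bound` and `gene_expressed` are hypothetical names, and the model ignores dose dependence and leaky expression.

```python
def operator_bound(tc_present: bool) -> bool:
    """TetR and tTA both bind tet operator sequences only when Tc is absent."""
    return not tc_present


def gene_expressed(regulator: str, tc_present: bool) -> bool:
    """Qualitative on/off state of a tet operator-linked gene."""
    bound = operator_bound(tc_present)
    if regulator == "TetR":
        return not bound   # a bound repressor blocks transcription
    if regulator == "tTA":
        return bound       # a bound activator drives transcription
    raise ValueError(f"unknown regulator: {regulator!r}")
```

With tTA, expression is off in the continuous presence of Tc and is induced by removing Tc; with TetR alone, the relationship is inverted.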

Alternative methods of controlling the ratio of carotenoids include using enzyme 
inhibitors to regulate the activity levels of particular enzymes. 

Production of Carotenoids 

Carotenoids can be produced in vitro or in vivo. For example, one or more 
polypeptides of the invention can be contacted with an appropriate substrate or 
combination of substrates to produce the desired carotenoid (e.g., astaxanthin). See, FIG. 
1 for a schematic of the carotenoid biosynthetic pathway. 

A particular carotenoid (e.g., astaxanthin, lycopene, β-carotene, lutein, zeaxanthin, 
zeaxanthin diglucoside, or canthaxanthin) also can be produced by providing an 
engineered microorganism and culturing the provided microorganism with culture 
medium such that the carotenoid is produced. In general, the culture media and/or culture 
conditions are such that the microorganisms grow to an adequate density and produce the 
desired compound efficiently. For large-scale production processes, the following 
methods can be used. First, a large tank (e.g., a 100 gallon, 200 gallon, 500 gallon, or 
more tank) containing appropriate culture medium with, for example, a glucose carbon 
source is inoculated with a particular microorganism. After inoculation, the 
microorganisms are incubated to allow biomass to be produced. Once a desired biomass 
is reached, the broth containing the microorganisms can be transferred to a second tank. 
This second tank can be any size. For example, the second tank can be larger, smaller, or 
the same size as the first tank. Typically, the second tank is larger than the first such that 
additional culture medium can be added to the broth from the first tank. In addition, the 
culture medium within this second tank can be the same as, or different from, that used in 
the first tank. For example, the first tank can contain medium with xylose, while the 
second tank contains medium with glucose. 

Once transferred, the microorganisms can be incubated to allow for the 
production of the desired carotenoid. Once produced, any method can be used to isolate 
the desired compound. For example, if the microorganism releases the desired carotenoid 
into the broth, then common separation techniques can be used to remove the biomass 
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from the broth, and common isolation procedures (e.g., extraction, distillation, and ion- 
exchange procedures) can be used to obtain the carotenoid from the microorganism-free 
broth. In addition, the desired carotenoid can be isolated while it is being produced, or it 
can be isolated from the broth after the product production phase has been terminated. If 
5 the microorganism retains the desired carotenoid, the biomass can be collected and the 
carotenoid can be released by treating the biomass or the carotenoid can be extracted 
directly from the biomass. Extracted carotenoid can be formulated as a nutraceutical. As 
used herein, a nutraceutical refers to a compound(s) that can be incorporated into a food, 
tablet, powder, or other medicinal form that, upon ingestion by a subject, provides a 
specific medical or physiological benefit to the subject.

Alternatively, the biomass can be collected and dried, without extracting the 
carotenoids. The biomass then can be formulated for human consumption (e.g., as a 
dietary supplement) or as an animal feed (e.g., for companion animals such as dogs, cats, 
and horses, or for production animals). For example, the biomass can be formulated for 
consumption by poultry such as chickens and turkeys, or by cattle, pigs, and sheep.
Feeding of such compositions may increase yield of breast meat in poultry and may 
increase weight gain in other farm animals. In addition, the carotenoids may increase 
shelf-life of meat products due to the increased antioxidant protection afforded by the 
carotenoids. The biomass also can be formulated for use in aquaculture. For example, 
biomass that includes an engineered microorganism that is producing, e.g., astaxanthin
and/or canthaxanthin, can be fed to fish or crustaceans to pigment the flesh or carapace,
respectively. Such a composition is particularly useful for feeding to fish such as salmon,
trout, sea bream, or snapper, or crustaceans such as shrimp, lobster, and crab.

One or more components can be added to the biomass before or after drying,
including vitamins, other carotenoids, antioxidants such as ethoxyquin, vitamin E,
butylated hydroxyanisole (BHA), butylated hydroxytoluene (BHT), or ascorbyl palmitate,
vegetable oils such as corn oil, safflower oil, sunflower oil, or soybean oil, and an edible
emulsifier, such as soybean lecithin or sorbitan esters. Addition of antioxidants and
vegetable oils can help prevent degradation of the carotenoid during processing (e.g.,
drying), shipment, and storage of the composition.

The invention will be further described in the following examples, which do not
limit the scope of the invention described in the claims.



EXAMPLES

Example 1 - Cloning of the zeaxanthin gene cluster from Pantoea stewartii:



Genomic DNA from P. stewartii was isolated and digested with restriction enzymes to
yield genomic DNA fragments approximately 8-10 kb in size. These genomic DNA
fragments were ligated into a vector cut with the same restriction enzyme, and
electroporated into electrocompetent E. coli. Transformant colonies were individually
picked and transferred onto fresh solid media with the appropriate antibiotic selection
(ampicillin/ampicillin substitute). It was thought that E. coli colonies containing the P.
stewartii carotenoid genes would appear yellow in color due to the production of
zeaxanthin pigment or red due to the production of lycopene. Although at least 2000
ampicillin-resistant E. coli transformants were screened, none of the colonies was found
to contain the P. stewartii carotenoid genes.

Instead, a second, PCR-based method was used to identify and sequence the
carotenoid (crt) gene cluster from P. stewartii genomic DNA. Degenerate primers were
designed based on homologous regions identified in the crt genes from Erwinia herbicola
and Erwinia uredovora. Table 2 provides the position of the crt genes in E. herbicola and
E. uredovora.

TABLE 2
Position of crt genes in E. herbicola and E. uredovora

             Start of Gene (nucleotide #)              End of Gene (nucleotide #)
Gene name    E. herbicola        E. uredovora          E. herbicola        E. uredovora
CrtE         3535                198                   4458                1133
Orf-6        4521                -                     5564                -
CrtX         5561                1143                  6802                2438
CrtY         6799                2422                  7959                3570
CrtI         7956                3582                  9434                5060
CrtB         9431                5096                  10360               5986
CrtZ         10826 (complement)  6452 (complement)     10296 (complement)  5925 (complement)
Orf-12       12127 (complement)  -                     10916 (complement)  -
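
As a quick consistency check on Table 2, gene lengths and overlaps between adjacent genes can be computed from the listed coordinates. The following is only an illustrative sketch: it assumes the positions are 1-based and inclusive (the usual GenBank convention, which the text does not state), and the names E_HERBICOLA, gene_length, and overlap are hypothetical helpers rather than part of the described method.

```python
# Coordinates copied from the E. herbicola columns of Table 2 (forward-strand genes only).
# Assumption: positions are 1-based and inclusive, as in GenBank feature tables.
E_HERBICOLA = {
    "crtE": (3535, 4458),
    "orf-6": (4521, 5564),
    "crtX": (5561, 6802),
    "crtY": (6799, 7959),
    "crtI": (7956, 9434),
    "crtB": (9431, 10360),
}


def gene_length(start, end):
    """Length in bp of a feature given inclusive 1-based coordinates."""
    return abs(end - start) + 1


def overlap(upstream, downstream):
    """Number of bases shared by two same-strand features (0 if they do not overlap)."""
    return max(0, upstream[1] - downstream[0] + 1)


if __name__ == "__main__":
    for name, (start, end) in E_HERBICOLA.items():
        print(f"{name}: {gene_length(start, end)} bp")
    names = list(E_HERBICOLA)
    for first, second in zip(names, names[1:]):
        shared = overlap(E_HERBICOLA[first], E_HERBICOLA[second])
        print(f"{first}/{second} overlap: {shared} bp")
```

Under these assumptions, crtE of E. herbicola spans 924 bp, and each adjacent pair from orf-6 through crtB shares 4 bp, whereas crtE and orf-6 do not overlap.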
The following primers were designed (Table 3) and used in various combinations
to yield PCR products of varying lengths. P. stewartii genomic DNA was used as 
template. 



TABLE 3
Sequences of Degenerate Primers

Primer Name    Primer Sequence                                          SEQ ID NO
P.s.BCHy1      5' ATYATr T r APGGCTGGGGWTGGSGMTGGCA - 3'                13
P.s.BCHy2      5' fTr T rr A R POYTG ATGC APC AGMCCGTCRTGCA - 3'        14
P.s.PS1        5' PTn ATGPTPTAYGrrTGGTGrrGrrA - 3'                      15
P.s.PS2        5' - TCGCGRGCRATRTTSGTCARCTG - 3'                        16
P.s.LBC1       5' - ATBMTSATGGAYGCSACSGT - 3'                           17
P.s.LBC2       5' - YTRATCGARGAYACGCRCTA - 3'                           18
P.s.LBC3       5' - RSGGCAGYGAATAGCCRGTG - 3'                           19
P.s.LBC4       5' - AACAGCATSCGRTTCAGCAKGCGSA - 3'                      20
P.s.PD5        5' - CCGACGGTKATCACCGATCC - 3'                           21
P.s.PD6        5' - CTGCGCCSACCAGGTAGAG - 3'                            22
P.s.GGPPS1     5' - CTYGACGAYATGCCCTGCATGGAC - 3' (MD92)                23
P.s.GGPPS2     5' - GTCGATTTWCCSGCGTCCTKATTG - 3' (MD93)                24


PCR was performed in a Gradient Thermocycler, and was started by incubating at
96°C for 5 minutes, followed by 40 cycles of denaturation at 96°C for 30 seconds,
annealing at 40°C, 45°C, 50°C, 55°C, or 60°C for 105 seconds, and extension at 72°C for
90 seconds, followed by incubation at 72°C for 10 minutes. The concentration of MgCl2 in
the PCR reactions also was varied and ranged from a final concentration of 1.5 mM to 6
mM. Table 4 provides the predicted size of the PCR products with various primer
combinations.

TABLE 4
Expected sizes of PCR Products

Primer Combination       PCR product length (bp)    Product Observed
BCHy1/BCHy2              230                        Yes
PS1/PS1                  410                        Yes
LBC1/LBC3                -                          1 cb
LBC1/LBC4                460                        Yes
PD1/PD2                  -                          No
PD1/PD4                  IzoU                       -
LBC2/LBC3                240                        No
PD3/PD4                  410                        Yes
LBC2/LBC4                380                        Yes
PD5/PD6                  1200                       Yes
PS1/PS2                  410                        Yes
BCHy1/BCHy2              230                        Yes
PsGGPPS1/PsGGPPS2        470                        Yes
LBCDown1/PDUp1           470                        Yes
PDDown1/PSUp1            300                        Yes
BCHyDown1/PSDown1        700                        Yes
LBCUp1/GGPPSdn1          1600                       Yes


PCR reactions were electrophoresed through agarose gels to estimate sizes of PCR 
products and DNA was extracted from the gel using a Qiagen gel extraction kit. The 
purified PCR products were submitted to the Advanced Genetic Analysis Center (AGAC) 
at the University of Minnesota for sequencing. The obtained DNA sequences were 
subjected to BLAST analysis to determine if the sequences were homologous to crt genes 
from other bacteria. Sequence analysis of the 1.2-kb DNA fragment indicated that there
was homology to phytoene desaturase (crtI) genes from E. herbicola and E. uredovora,
while the 0.47-kb product had homology with the crtE genes from E. herbicola and E.
uredovora. 

Based on the DNA sequence information generated using the degenerate primers 
and amplified regions of the carotenoid genes from P. stewartii, primers specific for the 
P. stewartii crt genes were designed and are shown in Table 5. These specific primers 
were used to obtain information upstream and downstream of the DNA regions amplified
with the degenerate primers. This rationale was used to extend and obtain DNA sequence
information about the P. stewartii crt genes.



TABLE 5

Primer           Sequence                                   SEQ ID NO
PsOp.crtE        5'-GGCCGAATTCCAACGATGCTCTGGCAGTTA-3'       25
PsOp.crtZ(-)     5'-GGCCAGATCTACTTCAGGCGACGCTGAGAG-3'       26
PsOp.crtZ(+)     5'-GGCCAGATCTTACGCGCGGGTAAAGCCAAT-3'       27
PsOp.crtZ(2+)    5'-GGCCTCTAGAATTACCGCGTGGTTCTGAAG-3'       28
PsOp.crtZ(2-)    5'-GGCCTCTAGATCTGTACGCGCCACCGTTAT-3'       29



After unsuccessful attempts at completing the crt gene cluster sequence
from P. stewartii using PCR, the Universal Genome Walker kit from Clontech was used
to obtain the complete sequence of the P. stewartii crtE and crtZ genes. This kit uses a
PCR-based approach. The following primer pairs were synthesized and used for the
genome walking experiments: GWcrtE2, 5' -
CATCGGTAAGATCGTCAAGCAACTGAA - 3' (SEQ ID NO:30) and GWcrtE1, 5' -
GATTTACCTGCATCCTGATTGATGTCT - 3' (SEQ ID NO:31); and GWcrtZ1, 5' -
ATGTATAACCGTTTCAGGTAGCCTTTG - 3' (SEQ ID NO:32) and GWcrtZ2, 5' -
AATACAGTAAACCATAAGCGGTCATGC - 3' (SEQ ID NO:33). The sequences of
the crt genes and encoded proteins from P. stewartii were compared to the sequences of
the crt genes and proteins from E. herbicola and E. uredovora using BLAST under
default parameters. See SEQ ID NOS:1-12 for the nucleotide and amino acid sequences
of the P. stewartii crt genes. The results of the alignment are provided in Table 6.



TABLE 6
Comparison of crt genes and proteins from P. stewartii to E. herbicola and E. uredovora

        Comparison of nucleotide sequence      Comparison of protein sequence
        of P. stewartii to                     of P. stewartii to
Gene    E. herbicola      E. uredovora         E. herbicola      E. uredovora
crtE    59%               80%                  81%               83%
crtX    56%               75%                  75%               74%
crtY    58%               77%                  83%               82%
crtI    69%               81%                  89%               89%
crtB    63%               81%                  88%               88%
crtZ    65%               84%                  65%               88%
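
The percentages in Table 6 were obtained with BLAST, which reports identity as the number of matching positions divided by the alignment length. As a rough illustration of that arithmetic only (this is not BLAST and performs no alignment), the sketch below computes percent identity for two sequences that are assumed to be already aligned and gap-padded. The function name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


if __name__ == "__main__":
    # Toy aligned sequences, for illustration only (not taken from the crt genes).
    print(percent_identity("ATGGC-TA", "ATGACGTA"))
```

A gap opposite a base counts as a mismatched column here, which mirrors how identity is usually normalized to alignment length.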



Example 2 - Cloning of a β-carotene C4 Oxygenase from Brevundimonas
aurantiaca: Degenerate PCR primers for crtW were designed based on crtW genes from
Bradyrhizobium, Alcaligenes, Agrobacterium aurantiacum, and Paracoccus marcusii.
The primers had the following sequences: crtW(1S1P.m.) -
5'TTCATCATCGCGCATGAC3' (SEQ ID NO:34) and crtW(668P.m.) -
5'AGRTGRTGYTCGTGRTGA3' (SEQ ID NO:35), and were synthesized by Integrated
DNA Technologies Inc. (Coralville, IA). PCR was performed in a Mastercycler gradient
machine (Eppendorf) with genomic DNA from B. aurantiaca (ATCC Accession No.
15266). Reaction conditions included five minutes at 96°C, followed by 30 cycles of
denaturation at 94°C for 30 sec, annealing at 50°C for 2 min, and extension at 72°C for 2
min 30 sec, and a final 72°C incubation for 10 min. An approximately 500-bp PCR
product was obtained and cloned into the vector pCR-Blunt II-TOPO (Invitrogen Corp.
Carlsbad, CA).
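
For planning purposes, the minimum run time of a cycling program such as the one above can be estimated by summing its hold times. The sketch below does this arithmetic; it deliberately ignores temperature ramping, which depends on the instrument, and the function names are hypothetical.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Total hold time of a PCR program, excluding ramp time between temperatures."""
    return initial_s + cycles * sum(step_seconds) + final_s


def program_minutes(initial_s, cycles, step_seconds, final_s):
    """Same total expressed in minutes."""
    return program_seconds(initial_s, cycles, step_seconds, final_s) / 60


if __name__ == "__main__":
    # Example 2 program: 5 min at 96°C; 30 cycles of 30 s, 2 min, 2 min 30 s; 10 min at 72°C.
    print(program_minutes(5 * 60, 30, [30, 120, 150], 10 * 60), "min")
```

Under the same assumption, the Example 1 program (40 cycles of 30, 105, and 90 seconds with the same initial and final holds) has an identical total hold time.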

Independent clones were sequenced using the universal M13 forward and reverse
primers. DNA sequencing was carried out at AGAC, University of Minnesota, St. Paul,
MN. Partial nucleotide sequence of the crtW gene was obtained. Alignment of the partial
sequence with known crtW genes indicated that the sequences aligned toward the
N-terminus and C-terminus, respectively, of the crtW genes from Bradyrhizobium,
Alcaligenes, Agrobacterium aurantiacum, and Paracoccus marcusii. The Universal
Genome Walker kit from Clontech was used to obtain the complete sequence of the B.
aurantiaca crtW gene. Primers were synthesized based on the partial sequence and used
for the genome walking experiments.

Upon obtaining sequence from the ends of the gene, the following oligonucleotide
primers were synthesized and used to amplify the complete crtW gene from genomic
DNA: 5'-GCGGCATAGGCTAGATTGAAG-3' (primer 1, Tm = 72°C, SEQ ID NO:36)
and 5'-GCGAGTTCCTTCTCACCTAT-3' (primer 2, Tm = 67°C, SEQ ID NO:37). B.
aurantiaca (ATCC 15266) genomic DNA was prepared with the Qiagen genomic-tip
500G kit (Valencia, CA; Catalog # 10262) following the manufacturer's protocol. Briefly,
30 ml of B. aurantiaca culture were grown overnight at 30°C in ATCC medium 36
(Caulobacter medium; 2 g/l peptone, 1 g/l yeast extract, 0.2 g/l MgSO4·7H2O). Cultures
were harvested by centrifugation (15,000 x g; 10 minutes) and genomic DNA was purified
following the manufacturer's recommended protocol (Qiagen Genomic DNA Handbook
for Blood, Cultured Cells, Tissue, Mouse Tails, Yeast, Bacteria (Gram- & some Gram+)).
The Expand DNA polymerase system (Roche Molecular Biochemicals, Indianapolis, IN;
catalog # 1732641) was used in a reaction that included 2 µl of B. aurantiaca genomic
DNA (50 ng/µl), 1 µl of primer 1 (100 pmol/µl), 1 µl of primer 2 (100 pmol/µl), 5 µl of
10x PCR buffer, 1 µl of Expand DNA polymerase (3.5 U/µl), 2.5 µl of dimethyl sulfoxide
(DMSO), 2 µl of dNTPs (10 nmol/µl each), and 35.5 µl of ddH2O. Reaction conditions
included five minutes at 96°C, followed by 30 cycles of denaturation at 94°C for 30 sec,
annealing at 50°C for 2 min, and extension at 72°C for 2 min 30 sec, and a final 72°C
incubation for 10 min.
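
The reaction composition above can be checked by summing the component volumes and converting the stock concentrations into final concentrations. The sketch below is illustrative and rests on stated assumptions: all volumes are taken to be in microliters, and the 10x PCR buffer volume is taken to be 5 µl. The names REACTION_UL, total_volume, and final_concentration are hypothetical.

```python
# Component volumes in microliters, as read from the reaction described above.
# Assumption: the 10x PCR buffer volume is 5 µl.
REACTION_UL = {
    "genomic DNA (50 ng/µl)": 2.0,
    "primer 1 (100 pmol/µl)": 1.0,
    "primer 2 (100 pmol/µl)": 1.0,
    "10x PCR buffer": 5.0,
    "Expand DNA polymerase (3.5 U/µl)": 1.0,
    "DMSO": 2.5,
    "dNTPs (10 nmol/µl each)": 2.0,
    "ddH2O": 35.5,
}


def total_volume(components):
    """Sum of all component volumes in microliters."""
    return sum(components.values())


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after dilution into the reaction, in the same units as the stock."""
    return stock * volume_ul / total_ul


if __name__ == "__main__":
    total = total_volume(REACTION_UL)
    print("total volume (µl):", total)
    # 100 pmol/µl is 100 µM, so the result is in µM.
    print("each primer (µM):", final_concentration(100, 1.0, total))
    # 10 nmol/µl is 10 mM, so the result is in mM.
    print("each dNTP (mM):", final_concentration(10, 2.0, total))
    print("buffer (x):", final_concentration(10, 5.0, total))
    print("DMSO (% v/v):", final_concentration(100, 2.5, total))
    print("template DNA (ng):", 50 * 2.0)
```

With these readings the components add up to a 50 µl reaction with 1x buffer, which supports the assumed buffer volume.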

PCR products were electrophoresed through a 0.8% agarose gel and the ~0.85 kb
band was excised from the gel and purified using the Qiagen QIAquick Gel Extraction
Kit (catalog #28704) following the manufacturer's recommended protocol (QIAquick
Spin Handbook). Gel-purified PCR product was cloned into the blunt-end cloning site of
pCR-Blunt II-TOPO (Clontech; Palo Alto, CA) to generate pTOPOcrtW. Ligation
mixtures were electroporated (25 µF, 200 Ohms, 12.5 kV/cm) into E. coli DH10B
Electromax cells (Gibco BRL; Gaithersburg, MD; catalog #18290-015). Transformants
were allowed to recover for 60 minutes at 37°C with shaking in 1 ml of SOC medium. Cells
were plated on LB agar + 50 µg/ml kanamycin and allowed to grow overnight at 37°C.
Transformant colonies were inoculated into 1 ml LB broth + 50 µg/ml kanamycin and
allowed to grow overnight at 37°C with shaking. Minipreps were prepared using the
QIAprep Spin Miniprep Kit (50) (catalog #27104) following the manufacturer's protocol
and the presence of pTOPOcrtW was screened for by restriction analysis with EcoRI.
EcoRI digests of pTOPOcrtW yielded products of ~0.85 kbp and 3.5 kbp.

The crtW gene was sequenced by AGAC, University of Minnesota, St. Paul, MN.
The nucleotide sequence of the crtW gene from B. aurantiaca is provided in SEQ ID
NO:38, and the protein encoded by the crtW gene is provided in SEQ ID NO:39.

Example 3 - Transformation of pTOPOcrtW into Pantoea stewartii and
production of astaxanthin and adonixanthin in P. stewartii::pTOPOcrtW: The
following protocol describes expression of crtW in the zeaxanthin-producing host P.
stewartii. This yields a transformed host that is capable of producing astaxanthin (i.e.,
3,3'-dihydroxy-β,β-carotene-4,4'-dione) and adonixanthin (3,3'-dihydroxy-β,β-carotene-
4-one). Electrocompetent P. stewartii (ATCC 8200) cells were prepared by culturing 50
ml of a 5% inoculum of P. stewartii cells in LB at 30°C with agitation (250 rpm) until an
OD590 of 0.5-1.0 was reached. The bacteria were washed in 50 ml of 10 mM HEPES (pH
7.0) and centrifuged for 10 minutes at 10,000 x g. The wash was repeated with 25 ml of
10 mM HEPES (pH 7.0) followed by the same centrifugation protocol. The cells then
were washed once in 25 ml of 10% glycerol. Following centrifugation, the cells were
resuspended in 500 µl of 10% glycerol. Forty-µl aliquots were frozen and kept at -80°C
until use.

Plasmid pTOPOcrtW was electroporated into electrocompetent P. stewartii cells
(25 µF, 25 kV/cm, 200 Ohms) and plated onto LB agar plates containing 50 µg/ml
kanamycin. As a negative control, pCR-Blunt II-TOPO self-ligated parental vector also
was electroporated into P. stewartii and plated onto LB agar plates containing 50 µg/ml
kanamycin. Individual colonies of P. stewartii::pTOPOcrtW were screened by visual
inspection for a phenotypic change from bright yellow pigmentation (production of
zeaxanthin) to a reddish-orange pigmentation (production of astaxanthin) and chosen for
further pigment analysis. No phenotypic change was noted for individual colonies of P.
stewartii::pCR-Blunt II-TOPO, so clones were randomly chosen for pigment analysis.

Production of astaxanthin was confirmed by HPLC/MS. Carotenoids were
extracted from cells harvested from 5-day-old cultures of P. stewartii::pTOPOcrtW or P.
stewartii::pCR-Blunt II-TOPO (25 ml) grown in LB with 50 µg/ml kanamycin by
resuspending the washed cell pellet in 5 ml of acetone. Glass beads were added and the
mixture was incubated for 60 minutes at room temperature in the dark with occasional
vortexing. The cells were separated from the acetone extract by centrifugation at 15,000 x
g for 10 minutes. The acetone supernatant then was analyzed by HPLC/MS.

A Waters 2790 LC system was used with two reverse-phase C30 specialty
columns designed for carotenoid separation (YMC Carotenoid S3m; 2.0 x 150 mm, 3
µm particle size; Waters Corporation, PN CT99S031502WT), in tandem. The columns
were run at room temperature. A gradient of Mobile Phase A (0.1% acetic acid) and
Mobile Phase B (90% acetone) was used to separate zeaxanthin and astaxanthin
according to the following gradient timetable: 0 min (10%A, 90%B), 10 min (100%B), 12
min (10%A, 90%B), 15 min (10%A, 90%B). Flow rate was 0.3 ml/min. Samples were
stored at 20°C in an autosampler and a volume of 25 µL was injected. A Waters 996
Photodiode array detector, 350-550 nm, was used to detect zeaxanthin and astaxanthin.
Under these chromatography conditions astaxanthin eluted at approximately 5.42-5.51
min and zeaxanthin eluted at approximately 6.22-6.4 min.
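
The gradient timetable above can be written down as a small lookup that returns the mobile-phase composition at any time point. The sketch below assumes linear ramps between the listed time points, which is a common pump default but is not stated in the text; GRADIENT, percent_b, and run_volume_ml are hypothetical names.

```python
# (time in minutes, percent Mobile Phase B) taken from the gradient timetable.
GRADIENT = [(0.0, 90.0), (10.0, 100.0), (12.0, 90.0), (15.0, 90.0)]
FLOW_ML_PER_MIN = 0.3


def percent_b(t_min):
    """Percent B at time t_min, assuming linear ramps between timetable points."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # hold the final composition after the last point


def run_volume_ml(flow_ml_per_min=FLOW_ML_PER_MIN):
    """Total mobile phase consumed over one run at constant flow."""
    return flow_ml_per_min * GRADIENT[-1][0]


if __name__ == "__main__":
    for t in (0, 5, 10, 11, 12, 15):
        print(f"{t:>2} min: {percent_b(t):.1f}% B, {100 - percent_b(t):.1f}% A")
    print("mobile phase per run (ml):", round(run_volume_ml(), 2))
```

Under the linear-ramp assumption, both astaxanthin and zeaxanthin elute while the composition is still moving from 90% B toward 100% B.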

Carotenoid standards were used to identify the peaks. Astaxanthin was obtained
from Sigma Chemical Co. (St. Louis, MO) and zeaxanthin was obtained from
Extrasynthese (France). UV-Vis absorption spectra were used as diagnostic features for
the carotenoids as were the molecular ion and fragmentation patterns generated using
mass spectrometry. A positive-ion atmospheric pressure chemical ionization mass
spectrometer was used; scan range, 400-800 m/z with a quadrupole ion trap. A
representative HPLC chromatogram is shown in FIG. 3, which confirms production of
astaxanthin in P. stewartii transformed with the B. aurantiaca crtW gene.

Example 4 - Simultaneous Production of CoQ-10 and (3S, 3'S)-Astaxanthin in
a Microorganism: Although Phaffia rhodozyma is not capable of producing the (3S, 3'S)
isoform of astaxanthin, it is known to produce Coenzyme Q-10. This compound has been
found to have particularly high value as a nutraceutical. The current invention is of
particular value since R. sphaeroides is known to produce Coenzyme Q-10 and has been
transformed with genes that, while novel, are nevertheless homologous to native genes in
the MABP. Consequently, the described organism can be expected to simultaneously
produce both Coenzyme Q-10 and (3S, 3'S)-ATX. This is the first described
production of both (3S, 3'S)-ATX and Coenzyme Q-10 in a single microbial host.

The identification of (3S, 3'S)-ATX can be accomplished as described by Maoka,
T., et al., J. Chromatogr. 318:122-124 (1985). Briefly, this consists of extraction of the
carotenoid pigments by contacting the biomass with a suitable organic solvent such as
acetone or dichloromethane. The carotenoid extract is then dried under a stream of
liquid nitrogen and resuspended in a solvent of n-hexane-dichloromethane-ethanol
(48:16:0.6). The extract is applied to a Sumipax OA-2000 (particle size 10 µm) 250 x 4
mm I.D. (Sumitomo Chemicals, Osaka, Japan) chiral resolution HPLC column at a flow
rate of 0.8 ml/min. Generally, the order of elution is expected to be (3R, 3'R)-ATX
followed by (3R, 3'S; 3S, 3'R)-ATX followed by (3S, 3'S)-ATX. A similar separation is
described in Maoka, T., et al., Comp. Biochem. Physiol. 83B:121-124 (1986). Briefly,
this consists of isolation of the carotenoid, derivatization to the dibenzoate form with
benzoyl chloride, and separation of the enantiomers using a Sumipax OA-2000 chiral
resolution HPLC column.

Example 5 - Transformation of the multifunctional GGPP synthase from
Archaeoglobus fulgidus into Rhodobacter strain ppsR- with the crtY and crtI genes
from Pantoea stewartii inserted into the chromosome: The following protocol
describes the generation of a β-carotene producing strain of R. sphaeroides (ATCC
35053), a facultative photoheterotroph, in which the ppsR gene was deleted by using the
in-frame deletion procedure of Higuchi, R., et al., Nucleic Acid Res. 16:7351-7367 to
generate strain AREG. Table 7 describes the strains and plasmids used in this example.
PpsR is a transcription factor that is involved in the repression of photosystem gene
expression under aerobic growth conditions. The region of the chromosome that included
the native tspO, crtC, crtD, crtE and crtF genes of AREG was replaced by the lycopene
β cyclase (crtY) and phytoene desaturase (crtI) genes from P. stewartii using the
procedure of Oh and Kaplan, Biochemistry 38:2688-2696 (1999); and Lenz, et al., J.
Bacteriology 176:4385-4393 (1994), to generate the strain AREG(A5:YI). Briefly, the
crtY and crtI genes were cloned into pLO1, a suicide vector for R. sphaeroides
containing the kanamycin resistance gene and the Bacillus subtilis sacB gene encoding
sensitivity to sucrose. DNA fragments flanking the crtYI genes and identical in sequence
to ~500 bp internal fragments of the R. sphaeroides tspO and crtF genes were then cloned
into pLO1. These flanking DNA regions correspond to the desired region for insertion of
the crtYI genes. Insertion of the crtYI genes in AREG was confirmed using PCR analyses
and appropriate PCR primers specific to the crtYI genes as well as flanking regions of the
R. sphaeroides genome. The crtYI (P. stewartii) insertion and tspO, crtC, crtD, crtE and
crtF (R. sphaeroides) deletion resulted in the lack of native carotenoid production and a
change in the pigmentation from red to green, confirming the insertion event.

TABLE 7
Description of Rhodobacter Strains and Plasmids

Strain: AREG
Description: ATCC 35053; ppsR regulatory mutant
Major Carotenoid Produced: Sphaeroidenone (Native Carotenoid)
Comments: Regulatory mutant

Strain: AREG(A5:YI)
Description: crtY and crtI genes of P. stewartii replaced 5 host genes (tspO, crtC, crtD,
crtE and crtF) on chromosome
Major Carotenoid Produced: None
Comments: β-carotene biosynthetic genes placed in chromosome. No carotenoid
production because of crtE deletion

Strain: AREG(A5:YI)::pPctrl
Description: Control vector introduced into AREG(A5:YI) host
Major Carotenoid Produced: None
Comments: Control vector contains rrnB promoter but no biosynthetic genes

Strain: AREG(A5:YI)::pPgps
Description: gps gene of A. fulgidus inserted into pPctrl control vector and introduced
into AREG(A5:YI) host
Major Carotenoid Produced: β-Carotene
Comments: gps gene on plasmid complements crtE deletion. Complete pathway for
β-carotene production

Strain: AREG(A5:YI)(AA:gps)
Description: gps gene of A. fulgidus replaced crtA host gene on chromosome of
AREG(A5:YI) host
Major Carotenoid Produced: β-Carotene
Comments: gps gene inserted into genome complements crtE deletion. Complete
pathway for β-carotene production

Strain: AREG(A5:YI)(AA:gps)::pPWZ
Description: crtW and crtZ genes inserted into pPctrl control vector and introduced into
AREG(A5:YI)(AA:gps) host
Major Carotenoid Produced: Astaxanthin
Comments: crtW and crtZ genes convert β-carotene into astaxanthin

Strain: AREG(A5:YI)(AA:gps)::pPgpsWZ
Description: gps, crtW and crtZ genes inserted into pPctrl control vector and introduced
into AREG(A5:YI)(AA:gps) host
Major Carotenoid Produced: Astaxanthin
Comments: Additional copies of the gps gene on plasmid increase production of
astaxanthin

Plasmid: pBBR1MCS2
Genetic elements inserted: None

Plasmid: pPctrl
Genetic elements inserted: rrnB promoter

Plasmid: pPgps
Genetic elements inserted: rrnB promoter, A. fulgidus gps

Plasmid: pPWZ
Genetic elements inserted: rrnB promoter, P. stewartii crtZ, B. aurantiaca crtW

Plasmid: pPgpsWZ
Genetic elements inserted: rrnB promoter, A. fulgidus gps, P. stewartii crtZ,
B. aurantiaca crtW

The pPctrl vector was constructed by inserting a copy of the R. sphaeroides rrnB
promoter (GenBank Accession # X53854; rrnBP) into the vector pBBR1MCS2 (GenBank
Accession # U23751). The rrnB promoter was isolated from the vector pTEX24 (S.
Kaplan) by a BamHI restriction enzyme digest, which released the promoter as a 363 bp
fragment. This fragment was gel purified from a 2% Tris-acetate-EDTA (TAE) agarose
gel. To prepare the pBBR1MCS2 vector for ligation, it also was digested with BamHI
and the enzyme was heat inactivated at 80°C for 20 minutes. The digested vector was
dephosphorylated with shrimp alkaline phosphatase (Roche Molecular Biochemicals,
Indianapolis, IN), and gel purified from a 1% TAE-agarose gel. The prepared vector and
the rrnB fragment were ligated using T4 DNA ligase at 16°C for 16 hours to generate the
plasmid pPctrl. One µL of ligation reaction was used to electroporate 40 µL of E. coli
Electromax™ DH10B™ cells (Life Technologies, Inc., Rockville, MD).

Electroporated cells were plated on LB media containing 25 µg/mL of kanamycin
(LBK). pPctrl DNA was isolated from cultures of single colonies and was digested with
HindIII to confirm the presence of a single insertion of the rrnB promoter. The sequence
of pPctrl also was confirmed by DNA sequencing.

The multifunctional GGPP synthase (gps) gene from A. fulgidus (GenBank
Accession No. AF120272) was cloned into the multiple cloning site of pPctrl to generate
the construct pPgps.

Electrocompetent AREG(A5:YI) cells were prepared as follows: 5 ml cultures
were inoculated using Sistrom's media supplemented with trace elements, vitamins
(O'Gara, et al., J. Bacteriol. 180:4044-4050 (1998); Cohen-Bazire, et al., J. Cell. Comp.
Physiol. 49:25-68 (1957)) and 0.4% glucose as a carbon source, and grown overnight at
30°C with shaking. This culture was diluted 1/100 in 300 mL of the same media and
grown to an OD660 of 0.5-0.8. The cells were chilled on ice for 10 minutes and then
centrifuged for 6 minutes at 7,500 g. The supernatant was discarded and the cell pellet
was resuspended in ice-cold 10% glycerol at half of the original volume. The cells were
pelleted by centrifugation for 6 minutes at 7,500 g. The supernatant was again discarded
and cells were resuspended in ice-cold 10% glycerol at one quarter of the original volume.
The last centrifugation and resuspension steps were repeated, followed by centrifugation
for 6 minutes at 7,500 g. The supernatant was decanted and the cells were resuspended in the
small volume of glycerol that did not drain out. Additional ice-cold 10% glycerol was
added to resuspend the cells if necessary. Forty µL of the resuspended cells was used in a
test electroporation (see below) to determine if the cells needed to be concentrated by
centrifugation or diluted with ice-cold 10% glycerol. Time constants of 8.5-9.0 resulted
in good transformation efficiencies. Once an acceptable time constant was achieved, cells
were aliquoted into cold microfuge tubes and stored at -80°C. All water used for media
and glycerol was 18 Mohm or higher.
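
The wash volumes in this preparation are all stated as fractions of the culture volume, so they can be tabulated for any batch size. The sketch below does this for the 300 mL culture described above; the function name wash_volumes_ml and the dictionary labels are hypothetical, and the inoculum figure simply reads "diluted 1/100" as one part overnight culture per hundred parts medium.

```python
def wash_volumes_ml(culture_ml):
    """Volumes implied by the fractions given in the protocol, for a given culture volume."""
    return {
        "overnight culture inoculum (1/100)": culture_ml / 100,
        "first glycerol wash (1/2 volume)": culture_ml / 2,
        "second glycerol wash (1/4 volume)": culture_ml / 4,
        "repeat glycerol wash (1/4 volume)": culture_ml / 4,
    }


def total_glycerol_ml(culture_ml):
    """Ice-cold 10% glycerol needed for the three washes, excluding the final resuspension."""
    volumes = wash_volumes_ml(culture_ml)
    return sum(v for name, v in volumes.items() if "glycerol" in name)


if __name__ == "__main__":
    for step, volume in wash_volumes_ml(300).items():
        print(f"{step}: {volume:g} mL")
    print("10% glycerol for washes:", total_glycerol_ml(300), "mL")
```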

Electroporation of AREG(A5:YI) was carried out as follows. One µL of pPgps or
pPctrl vector DNA was gently mixed into 40 µL of AREG(A5:YI) electrocompetent cells,
which then were transferred to an electroporation cuvette with a 0.2 cm electrode gap.
Electroporations were conducted using a Biorad Gene Pulser II (Biorad, Hercules, CA)
with settings at 2.5 kV of potential, 400 ohms of resistance, and 25 µF of capacitance.
Cells were recovered in 400 µL SOC media at 30°C for 6-16 hours. The cells were then
plated, 200 µL per plate, on LB medium containing 50 µg/ml kanamycin and incubated at
30°C for 5-6 days.
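
The electroporation settings above determine two derived quantities: the field strength across the cuvette and the nominal RC time constant of the pulse. The sketch below computes both. It assumes the time constants quoted for the test electroporation are in milliseconds, which the text does not state, and the function names are hypothetical. The nominal RC value is an upper bound, because the sample conducts in parallel with the set resistance.

```python
def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Nominal electric field across the cuvette gap."""
    return voltage_kv / gap_cm


def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """RC time constant in milliseconds for the set resistance and capacitance."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


if __name__ == "__main__":
    print("field strength (kV/cm):", field_strength_kv_per_cm(2.5, 0.2))
    print("nominal time constant (ms):", round(nominal_time_constant_ms(400, 25), 3))
```

On this reading, a measured time constant of 8.5-9.0 sits slightly below the nominal value, as would be expected for a low-conductivity cell suspension in 10% glycerol.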

After incubation, greenish colonies were observed on plates of AREG(A5:YI)
transformed with pPctrl plasmid DNA. The colonies that appeared on plates of
AREG(A5:YI) transformed with pPgps plasmid DNA appeared yellow. The yellow
pigmentation was indicative of β-carotene production in AREG(A5:YI) expressing the A.
fulgidus gps gene from pPgps.

Single yellow colonies were grown up in Sistrom's liquid media supplemented
with vitamins, trace elements and 0.4% glucose as well as 50 µg/ml kanamycin, at 30°C
with shaking for 24-48 hours. Carotenoids were extracted and subjected to LCMS
analysis as described above. Under the chromatography conditions used, β-carotene
eluted at approximately 13.87-14.2 min. β-carotene standard (Sigma Chemical, St. Louis,
MO) was used to identify the peaks. The UV-Vis absorption spectra and the retention
time using HPLC were used as diagnostic features for β-carotene identification in
AREG(A5:YI) transformed with pPgps DNA, as well as the molecular ion and
fragmentation patterns generated during mass spectrometry. Thus, the production of
β-carotene was confirmed in AREG(A5:YI) expressing the A. fulgidus gps gene from pPgps.



Example 6 - Transformation of the β-carotene C-4 ketolase (crtW) gene from
Brevundimonas aurantiaca and β-carotene hydroxylase (crtZ) from P. stewartii
into the AREG(A5:YI) strain of Rhodobacter with the gps gene from Archaeoglobus
fulgidus inserted into the chromosome: The following protocol describes the
generation of an astaxanthin producing strain of R. sphaeroides using AREG(A5:YI),
described above. See also Table 7 for further description of the strains and plasmids that
were used in this example. Using the gene insertion method described by Higuchi, R., et
al., Nucleic Acid Res. 16:7351-7367, the crtA gene of AREG(A5:YI) was replaced by the
gps gene from A. fulgidus to generate the strain AREG(A5:YI)(AA:gps).
Electrocompetent AREG(A5:YI)(AA:gps) cells were generated as described above.

The construct pPgpsWZ was produced by cloning the crtW gene from B.
aurantiaca, the crtZ gene from P. stewartii, and the gps gene from A. fulgidus into the
pPctrl plasmid using appropriate restriction enzymes. The construct pPWZ was produced
by cloning the crtW gene from B. aurantiaca and the crtZ gene from P. stewartii into the
pPctrl plasmid using appropriate restriction enzymes.

The pPWZ or pPgpsWZ constructs were electroporated into electrocompetent
AREG(A5:YI)(AA:gps) as described earlier to generate AREG(A5:YI)(AA:gps)::pPWZ or
AREG(A5:YI)(AA:gps)::pPgpsWZ, respectively. Transformation mixtures were plated
out onto LB plates containing 50 µg/ml kanamycin. PCR analyses using PCR primers
specific for crtZ were used to confirm the presence of the pPWZ or pPgpsWZ plasmids in
AREG(A5:YI)(AA:gps).

Single colonies of AREG(A5:YI)(AA:gps)::pPWZ or 
AREG(A5:YI)(AA:gps)::pPgpsWZ were grown up in media supplemented with 50 (ig/ml 

20 kanamycin as described earlier. Cell pellets were washed with distilled water and then 

carotenoids were extracted using acetone:methanol (7:2) at 30°C for 30 mins with shaking 
(;"; at 225 rpm. Carotenoid analysis was performed using LCMS analysis described above. 

The UV-Vis absorption spectra and the retention time using HPLC were used as 
diagnostic features for astaxanthin identification in AREG(A5:YI)(AA:gps)::pPWZ and 

25 ARJEG(A5:YI)(AA:gps)::pPgpsWZ, as well as the molecular ion and fragmentation 
patterns generated during mass spectrometry. The production of astaxanthin was 
confirmed in both AREG(A5:YI)(AA:gps)::pPWZ and AREG(A5:Yl)(AA:gps)::pPgpsWZ. 
Increased astaxanthin production was observed in AREG(A5:YI)(AA:gps)::pPgpsWZ. 

Example 7: Cloning and sequencing of a novel multifunctional
Geranylgeranyl pyrophosphate synthase gene (gps) from Sulfolobus shibatae:

Degenerate primer sequences MFGGPP1 (5' CCAYGAYGAYATWATGGA 3', SEQ ID
NO:40) and MFGGPP2 (5' YTTYTTVCCYTYCCTAAT 3', SEQ ID NO:41) were
designed based on conserved sequences in gps gene sequences from Sulfolobus
solfataricus and Sulfolobus acidocaldarius and synthesized by Integrated DNA
Technologies (Coralville, IA). PCR was performed in a Mastercycler gradient machine
(Eppendorf) with genomic DNA from S. shibatae (ATCC Accession No. 51178, lot #
1162977). Reaction conditions included five minutes at 96°C, followed by 30 cycles of
denaturation at 94°C for 30 sec, annealing at 50 ± 10°C for 60 sec, and extension at
72°C for 90 sec, and a final 72°C incubation for 10 min. An approximately 500-bp PCR
product was obtained and cloned into the vector pCR-BluntII-TOPO (Invitrogen Corp.,
Carlsbad, CA).
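
Because MFGGPP1 and MFGGPP2 are degenerate, each primer name denotes a pool of oligonucleotides rather than a single sequence. The following Python sketch is illustrative only and is not part of the described procedure. It expands the standard IUPAC ambiguity codes (Y = C or T, W = A or T, V = A, C, or G) to count and enumerate the members of each pool. The function names `degeneracy` and `expand` are hypothetical, and the primer strings are the sequences of SEQ ID NO:40 and SEQ ID NO:41 written without spaces.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence in the degenerate pool."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]

# Primer sequences as given for SEQ ID NO:40 and SEQ ID NO:41.
MFGGPP1 = "CCAYGAYGAYATWATGGA"
MFGGPP2 = "YTTYTTVCCYTYCCTAAT"

if __name__ == "__main__":
    for name, seq in (("MFGGPP1", MFGGPP1), ("MFGGPP2", MFGGPP2)):
        print(name, len(seq), "nt,", degeneracy(seq), "variants")
```

Under these assumptions, MFGGPP1 corresponds to 16 distinct sequences and MFGGPP2 to 48.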

Independent clones were sequenced using the universal M13 forward and reverse
primers. DNA sequencing was carried out at the AGAC, University of Minnesota,
St. Paul, MN. DNA sequence analysis of this PCR product indicated similarity to the gps
genes from S. solfataricus and S. acidocaldarius. The Universal Genome Walker kit
(Clontech) was used to obtain more of the gps gene sequence flanking the original PCR
product from S. shibatae. Primers were synthesized based on the partial sequence and
used for genome walking experiments.

The following strategy was used to completely sequence the S. shibatae gps gene.
The ERWCRTS homolog was observed upstream of the S. solfataricus gps gene. The
UDP-N-acetylglucosamine-dolichyl-phosphate N-acetylglucosamine
phosphotransferase gene was present downstream of the gps gene in both S. solfataricus
and S. acidocaldarius. Primers SsDolidn (5' ACAGCGTTGGACACTCAG 3', SEQ ID NO:42)
and SsERCRTup (5' GCGTCGATAATGGAAGTGAG 3', SEQ ID NO:43) were designed based
on the sequences of these two genes, which flank the gps gene. An approximately
2 kb PCR product was amplified using the SsDolidn and SsERCRTup primers and
genomic DNA from S. shibatae. This PCR product was cloned into the vector
pCR-BluntII-TOPO as described above and sequenced using the universal M13 forward
and reverse primers. The nucleotide sequence of the gps gene from S. shibatae is
presented in SEQ ID NO:44, and the amino acid sequence of the protein encoded by the
gps gene is presented in SEQ ID NO:45.

OTHER EMBODIMENTS 

It is to be understood that while the invention has been described in conjunction
with the detailed description thereof, the foregoing description is intended to illustrate and
not limit the scope of the invention, which is defined by the scope of the appended claims.
Other aspects, advantages, and modifications are within the scope of the following
claims.

WHAT IS CLAIMED IS:

1. An isolated nucleic acid having at least 76% sequence identity to the nucleotide
sequence of SEQ ID NO:1 or to a fragment of SEQ ID NO:1 at least 33 contiguous
nucleotides in length.

2. The isolated nucleic acid of claim 1, said nucleic acid having at least 80% sequence
identity to the nucleotide sequence of SEQ ID NO:1.

3. The isolated nucleic acid of claim 1, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:1.

4. The isolated nucleic acid of claim 1, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:1.

5. The isolated nucleic acid of claim 1, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:1.

6. An expression vector comprising the nucleic acid of claim 1 operably linked to an
expression control element.

7. An isolated nucleic acid encoding a zeaxanthin glucosyl transferase polypeptide at
least 75% identical to the amino acid sequence of SEQ ID NO:2.

8. An isolated nucleic acid having at least 78% sequence identity to the nucleotide
sequence of SEQ ID NO:3 or to a fragment of SEQ ID NO:3 at least 32 contiguous
nucleotides in length.

9. The isolated nucleic acid of claim 8, said nucleic acid having at least 80% sequence
identity to the nucleotide sequence of SEQ ID NO:3.

10. The isolated nucleic acid of claim 8, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:3.

11. The isolated nucleic acid of claim 8, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:3.

12. The isolated nucleic acid of claim 8, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:3.

13. An expression vector comprising the nucleic acid of claim 8 operably linked to an
expression control element.

14. An isolated nucleic acid encoding a lycopene β-cyclase polypeptide at least 83%
identical to the amino acid sequence of SEQ ID NO:4.

15. An isolated nucleic acid having at least 81% sequence identity to the nucleotide
sequence of SEQ ID NO:5 or to a fragment of SEQ ID NO:5 at least 60 contiguous
nucleotides in length.

16. The isolated nucleic acid of claim 15, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:5.

17. The isolated nucleic acid of claim 15, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:5.

18. The isolated nucleic acid of claim 15, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:5.

19. An expression vector comprising the nucleic acid of claim 15 operably linked to an
expression control element.

20. An isolated nucleic acid encoding a geranylgeranyl pyrophosphate synthase
polypeptide at least 85% identical to the amino acid sequence of SEQ ID NO:6.

21. An isolated nucleic acid having at least 82% sequence identity to the nucleotide
sequence of SEQ ID NO:7 or to a fragment of SEQ ID NO:7 at least 30 contiguous
nucleotides in length.

22. The isolated nucleic acid of claim 21, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:7.

23. The isolated nucleic acid of claim 21, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:7.

24. The isolated nucleic acid of claim 21, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:7.

25. An expression vector comprising the nucleic acid of claim 21 operably linked to an
expression control element.

26. An isolated nucleic acid encoding a phytoene desaturase polypeptide at least 90%
identical to the amino acid sequence of SEQ ID NO:8.

27. An isolated nucleic acid having at least 82% sequence identity to the nucleotide
sequence of SEQ ID NO:9 or to a fragment of SEQ ID NO:9 at least 23 contiguous
nucleotides in length.

28. The isolated nucleic acid of claim 27, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

29. The isolated nucleic acid of claim 27, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

30. The isolated nucleic acid of claim 27, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

31. An expression vector comprising the nucleic acid of claim 27 operably linked to an
expression control element.

32. An isolated nucleic acid encoding a phytoene synthase polypeptide at least 89%
identical to the amino acid sequence of SEQ ID NO:10.

33. An isolated nucleic acid having at least 85% sequence identity to the nucleotide
sequence of SEQ ID NO:11 or to a fragment of SEQ ID NO:11 at least 36 contiguous
nucleotides in length.

34. The isolated nucleic acid of claim 33, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

35. The isolated nucleic acid of claim 33, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

36. The isolated nucleic acid of claim 33, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

37. An expression vector comprising the nucleic acid of claim 33 operably linked to an
expression control element.

38. An isolated nucleic acid encoding a β-carotene hydroxylase polypeptide at least 90%
identical to the amino acid sequence of SEQ ID NO:12.

39. Membranous bacteria comprising at least one exogenous nucleic acid encoding
phytoene desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4
oxygenase, wherein expression of said at least one exogenous nucleic acid produces
detectable amounts of astaxanthin in said membranous bacteria.



40. The membranous bacteria of claim 39, wherein the amino acid sequence of said
phytoene desaturase is at least 90% identical to the amino acid sequence of SEQ ID
NO:8.

41. The membranous bacteria of claim 39, wherein the amino acid sequence of said
lycopene β-cyclase is at least 83% identical to the amino acid sequence of SEQ ID
NO:4.

42. The membranous bacteria of claim 39, wherein the amino acid sequence of said
β-carotene hydroxylase is at least 90% identical to the amino acid sequence of SEQ
ID NO:12.

43. The membranous bacteria of claim 39, wherein said membranous bacteria further
comprises an exogenous nucleic acid encoding geranylgeranyl pyrophosphate
synthase.

44. The membranous bacteria of claim 39, wherein said membranous bacteria lacks
endogenous bacteriochlorophyll biosynthesis.

45. The membranous bacteria of claim 43, wherein said exogenous nucleic acid encodes a
multifunctional geranylgeranyl pyrophosphate synthase.

46. The membranous bacteria of claim 45, wherein the amino acid sequence of said
multifunctional geranylgeranyl pyrophosphate synthase is at least 90% identical to the
amino acid sequence of SEQ ID NO:45.

47. The membranous bacteria of claim 39, wherein the amino acid sequence of said
β-carotene C4 oxygenase is at least 80% identical to the amino acid sequence of SEQ
ID NO:39.

48. The membranous bacteria of claim 39, wherein said membranous bacteria further
comprise an exogenous nucleic acid encoding phytoene synthase.

49. The membranous bacteria of claim 48, wherein the amino acid sequence of said
phytoene synthase is at least 89% identical to the amino acid sequence of SEQ ID
NO:10.

50. The membranous bacteria of claim 39, wherein said membranous bacteria are a
Rhodobacter species.

51. Membranous bacteria, said membranous bacteria comprising an exogenous nucleic
acid encoding a phytoene desaturase having an amino acid sequence at least 90%
identical to the amino acid sequence of SEQ ID NO:8, and wherein said membranous
bacteria produces detectable amounts of lycopene.

52. The membranous bacteria of claim 51, wherein said membranous bacteria further
comprise a lycopene β-cyclase, and wherein said membranous bacteria produce
detectable amounts of β-carotene.

53. The membranous bacteria of claim 52, wherein said membranous bacteria further
comprise a β-carotene hydroxylase, and wherein said membranous bacteria produce
detectable amounts of zeaxanthin.

54. Membranous bacteria comprising at least one exogenous nucleic acid encoding
phytoene desaturase, lycopene β-cyclase, and β-carotene C4 oxygenase, wherein
expression of said at least one exogenous nucleic acid produces detectable amounts of
canthaxanthin in said membranous bacteria.

55. A composition comprising an engineered Rhodobacter cell, wherein said cell
produces a detectable amount of astaxanthin or canthaxanthin.

56. The composition of claim 55, wherein said engineered Rhodobacter cell comprises at
least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase,
β-carotene hydroxylase, and β-carotene C4 oxygenase.

57. The composition of claim 55, wherein said composition is formulated for aquaculture.

58. The composition of claim 57, wherein said composition pigments the flesh of fish or
the carapace of crustaceans after ingestion.

59. The composition of claim 55, wherein said composition is formulated for human
consumption.

60. The composition of claim 55, wherein said composition is formulated as an animal
feed.

61. The composition of claim 60, wherein said animal feed is formulated for consumption
by chickens, turkeys, cattle, swine, or sheep.

62. A method of making a nutraceutical, said method comprising extracting carotenoids
from an engineered Rhodobacter cell, said engineered Rhodobacter cell comprising at
least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase,
β-carotene hydroxylase, and β-carotene C4 oxygenase, and wherein said Rhodobacter
cell produces detectable amounts of astaxanthin.

63. Membranous bacteria, said membranous bacteria comprising an exogenous nucleic
acid encoding a lycopene β-cyclase having an amino acid sequence at least 83%
identical to the amino acid sequence of SEQ ID NO:4.

64. The membranous bacteria of claim 63, said membranous bacteria further comprising a
phytoene desaturase, wherein said membranous bacteria produces detectable amounts
of β-carotene.

65. The membranous bacteria of claim 64, said membranous bacteria further comprising a
β-carotene hydroxylase, wherein said bacteria produces detectable amounts of
zeaxanthin.

66. Membranous bacteria, said membranous bacteria comprising a β-carotene
hydroxylase having an amino acid sequence at least 90% identical to the amino acid
sequence of SEQ ID NO:12.

67. The membranous bacteria of claim 66, said membranous bacteria further comprising a
lycopene β-cyclase, and wherein said membranous bacteria produces detectable
amounts of zeaxanthin.

68. The membranous bacteria of claim 67, said membranous bacteria further comprising a
phytoene desaturase, wherein said membranous bacteria produces detectable amounts
of β-carotene.

69. Membranous bacteria, said bacteria lacking an endogenous nucleic acid encoding a
farnesyl pyrophosphate synthase, and wherein said bacteria produce detectable
amounts of carotenoids.

70. The membranous bacteria of claim 69, wherein said bacteria comprise an exogenous
nucleic acid encoding a multifunctional geranylgeranyl pyrophosphate synthase.

71. The membranous bacteria of claim 70, wherein the amino acid sequence of said
multifunctional geranylgeranyl pyrophosphate synthase is at least 90% identical to the
amino acid sequence of SEQ ID NO:45.

72. The membranous bacteria of claim 69, wherein said membranous bacteria are a
species of Rhodobacter.

73. An isolated nucleic acid having at least 60% sequence identity to the nucleotide
sequences of SEQ ID NO:38, or to a fragment of the nucleic acid of SEQ ID NO:38 at
least 15 contiguous nucleotides in length.

74. The isolated nucleic acid of claim 73, said nucleic acid having at least 80% sequence
identity to the nucleotide sequences of SEQ ID NO:38, or to a fragment of the nucleic
acid of SEQ ID NO:38 at least 15 contiguous nucleotides in length.

75. The isolated nucleic acid of claim 73, said nucleic acid having at least 90% sequence
identity to the nucleotide sequences of SEQ ID NO:38, or to a fragment of the nucleic
acid of SEQ ID NO:38 at least 15 contiguous nucleotides in length.

76. The isolated nucleic acid of claim 73, wherein said nucleic acid encodes a β-carotene
C4 oxygenase.

77. Membranous bacteria comprising an exogenous nucleic acid encoding a β-carotene
C4 oxygenase, said β-carotene oxygenase having an amino acid sequence at least 80%
identical to the amino acid sequence of SEQ ID NO:39.

78. A host cell comprising an exogenous nucleic acid, wherein the exogenous nucleic acid
comprises a nucleic acid sequence encoding one or more polypeptides that catalyze
the formation of (3S, 3'S) astaxanthin, wherein the host cell produces CoQ-10 and
(3S, 3'S) astaxanthin.

79. A method of making CoQ-10 and (3S, 3'S) astaxanthin at substantially the same time,
the method comprising transforming a host cell with a nucleic acid, wherein the
nucleic acid comprises a nucleic acid sequence that encodes one or more
polypeptides, wherein the polypeptides catalyze the formation of (3S, 3'S)
astaxanthin; and culturing the host cell under conditions that allow for the production
of (3S, 3'S) astaxanthin and CoQ-10.

80. The method of claim 79, additionally comprising transforming the host cell with at
least one exogenous nucleic acid, the exogenous nucleic acid encoding one or more
polypeptides, wherein the polypeptides catalyze the formation of CoQ-10.

81. An isolated nucleic acid having a nucleotide sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:38, and SEQ ID NO:44.

82. An isolated nucleic acid having at least 90% sequence identity to the nucleotide
sequences of SEQ ID NO:44, or to a fragment of the nucleic acid of SEQ ID NO:44 at
least 60 contiguous nucleotides in length.

83. A method of making geranylgeranyl pyrophosphate, said method comprising
contacting isopentenyl pyrophosphate and dimethylallyl pyrophosphate with a
polypeptide encoded by the isolated nucleic acid of claim 82.

84. A method of making geranylgeranyl pyrophosphate, said method comprising
contacting farnesyl pyrophosphate and isopentenyl pyrophosphate with a polypeptide
encoded by the isolated nucleic acid of claim 15 or the polypeptide of claim 20.

85. A method of making β-carotene, said method comprising contacting lycopene with a
polypeptide encoded by the isolated nucleic acid of claim 8 or the polypeptide of
claim 14.

86. A method of making lycopene, said method comprising contacting phytoene with a
polypeptide encoded by the isolated nucleic acid of claim 21 or the polypeptide of
claim 26.

87. A method of making phytoene, said method comprising contacting geranylgeranyl
pyrophosphate with a polypeptide encoded by the isolated nucleic acid of claim 27 or
the polypeptide of claim 32.

88. A method of making zeaxanthin, said method comprising contacting β-carotene with a
polypeptide encoded by the isolated nucleic acid of claim 33 or the polypeptide of
claim 38.

89. A method of making canthaxanthin, said method comprising contacting β-carotene
with a polypeptide encoded by the isolated nucleic acid of claim 73 or a polypeptide
having an amino acid sequence at least 80% identical to the amino acid sequence of
SEQ ID NO:39.

90. A method of making astaxanthin, said method comprising contacting canthaxanthin
with a polypeptide encoded by the isolated nucleic acid sequence of claim 33 or the
polypeptide of claim 38.

91. A method of making astaxanthin, said method comprising contacting zeaxanthin with
a polypeptide encoded by the isolated nucleic acid sequence of claim 73 or a
polypeptide having an amino acid sequence at least 80% identical to the amino acid
sequence of SEQ ID NO:39.
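
Read together, method claims 83 to 91 trace a route from isoprenoid precursors to astaxanthin, with two alternative final branches through zeaxanthin or canthaxanthin. The following Python sketch is an illustrative summary only. It records each claimed conversion as an edge in a small directed graph and lists the routes between two compounds. The names `STEPS` and `routes` are hypothetical, the abbreviations IPP, DMAPP, FPP, and GGPP stand for the pyrophosphates named in the claims, and the enzyme labels paraphrase the claim language.

```python
from collections import deque

# substrate -> [(product, enzyme activity)], transcribed from claims 83-91.
STEPS = {
    "IPP + DMAPP": [("GGPP", "GGPP synthase of claim 82 (claim 83)")],
    "FPP + IPP": [("GGPP", "GGPP synthase (claim 84)")],
    "GGPP": [("phytoene", "phytoene synthase (claim 87)")],
    "phytoene": [("lycopene", "phytoene desaturase (claim 86)")],
    "lycopene": [("beta-carotene", "lycopene beta-cyclase (claim 85)")],
    "beta-carotene": [
        ("zeaxanthin", "beta-carotene hydroxylase (claim 88)"),
        ("canthaxanthin", "beta-carotene C4 oxygenase (claim 89)"),
    ],
    "canthaxanthin": [("astaxanthin", "beta-carotene hydroxylase (claim 90)")],
    "zeaxanthin": [("astaxanthin", "beta-carotene C4 oxygenase (claim 91)")],
}

def routes(start, target):
    """Return every route from start to target.

    Each route is a list of (substrate, enzyme, product) steps.
    The graph has no cycles, so the breadth-first walk terminates.
    """
    found = []
    queue = deque([(start, [])])
    while queue:
        compound, path = queue.popleft()
        if compound == target:
            found.append(path)
            continue
        for product, enzyme in STEPS.get(compound, []):
            queue.append((product, path + [(compound, enzyme, product)]))
    return found

if __name__ == "__main__":
    for route in routes("GGPP", "astaxanthin"):
        print(" -> ".join([route[0][0]] + [step[2] for step in route]))
```

In this representation there are two routes from GGPP to astaxanthin, which differ only in whether the hydroxylase or the C4 oxygenase acts on β-carotene first.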

SEQUENCE LISTING

<110> Cargill, Incorporated
<120> Carotenoid Biosynthesis

<130> 12794-004WO1

<150> US 60/288,984
<151> 2001-05-04

<150> US 60/264,329
<151> 2001-01-26

<160> 47 

<170> FastSEQ for Windows Version 4.0 

<210> 1
<211> 1296
<212> DNA
<213> Pantoea stewartii

<400> 1
atgagccatt ttgcggtgat cgcaccgccc tttttcagcc atgttcgcgc tctgcaaaac 60
cttgctcagg aattagtggc ccgcggtcat cgtgttacgt tttttcagca acatgactgc 120
aaagcgctgg taacgggcag cgatatcgga ttccagaccg tcggactgca aacgcatcct 180
cccggttcct tatcgcacct gctgcacctg gccgcgcacc cactcggacc ctcgatgtta 240
cgactgatca atgaaatggc acgtaccagc gatatgcttt gccgggaact gcccgccgct 300
tttcatgcgt tgcagataga gggcgtgatc gttgatcaaa tggagccggc aggtgcagta 360
gtcgcagaag cgtcaggtct gccgtttgtt tcggtggcct gcgcgctgcc gctcaaccgc 420
gaaccgggtt tgcctctggc ggtgatgcct ttcgagtacg gcaccagcga tgcggctcgg 480
gaacgctata ccaccagcga aaaaatttat gactggctga tgcgacgtca cgatcgtgtg 540
atcgcgcatc atgcatgcag aatgggttta gccccgcgtg aaaaactgca tcattgtttt 600
tctccactgg cacaaatcag ccagttgatc cccgaactgg attttccccg caaagcgctg 660
ccagactgct ttcatgcggt tggaccgtta cggcaacccc aggggacgcc ggggtcatca 720
acttcttatt ttccgtcccc ggacaaaccc cgtatttttg cctcgctggg caccctgcag 780
ggacatcgtt atggcctgtt caggaccatc gccaaagcct gcgaagaggt ggatgcgcag 840
ttactgttgg cacactgtgg cggcctctca gccacgcagg caggtgaact ggcccggggc 900
ggggacattc aggttgtgga ttttgccgat caatccgcag cactttcaca ggcacagttg 960
acaatcacac atggtgggat gaatacggta ctggacgcta ttgcttcccg cacaccgcta 1020
ctggcgctgc cgctggcatt tgatcaacct ggcgtggcat cacgaattgt ttatcatggc 1080
atcggcaagc gtgcgtctcg gtttactacc agccatgcgc tggcgcggca gattcgatcg 1140
ctgctgacta acaccgatta cccgcagcgt atgacaaaaa ttcaggccgc attgcgtctg 1200
gcaggcggca caccagccgc cgccgatatt gttgaacagg cgatgcggac ctgtcagcca 1260
gtactcagtg ggcaggatta tgcaaccgca ctatga 1296

<210> 2
<211> 431
<212> PRT
<213> Pantoea stewartii

<400> 2
Met Ser His Phe Ala Val Ile Ala Pro Pro Phe Phe Ser His Val Arg
1               5                   10                  15
Ala Leu Gln Asn Leu Ala Gln Glu Leu Val Ala Arg Gly His Arg Val
            20                  25                  30
Thr Phe Phe Gln Gln His Asp Cys Lys Ala Leu Val Thr Gly Ser Asp
        35                  40                  45
Ile Gly Phe Gln Thr Val Gly Leu Gln Thr His Pro Pro Gly Ser Leu
    50                  55                  60
Ser His Leu Leu His Leu Ala Ala His Pro Leu Gly Pro Ser Met Leu
65                  70                  75                  80
Arg Leu Ile Asn Glu Met Ala Arg Thr Ser Asp Met Leu Cys Arg Glu
                85                  90                  95
Leu Pro Ala Ala Phe His Ala Leu Gln Ile Glu Gly Val Ile Val Asp
            100                 105                 110
Gln Met Glu Pro Ala Gly Ala Val Val Ala Glu Ala Ser Gly Leu Pro
        115                 120                 125
Phe Val Ser Val Ala Cys Ala Leu Pro Leu Asn Arg Glu Pro Gly Leu
    130                 135                 140
Pro Leu Ala Val Met Pro Phe Glu Tyr Gly Thr Ser Asp Ala Ala Arg
145                 150                 155                 160
Glu Arg Tyr Thr Thr Ser Glu Lys Ile Tyr Asp Trp Leu Met Arg Arg
                165                 170                 175
His Asp Arg Val Ile Ala His His Ala Cys Arg Met Gly Leu Ala Pro
            180                 185                 190
Arg Glu Lys Leu His His Cys Phe Ser Pro Leu Ala Gln Ile Ser Gln
        195                 200                 205
Leu Ile Pro Glu Leu Asp Phe Pro Arg Lys Ala Leu Pro Asp Cys Phe
    210                 215                 220
His Ala Val Gly Pro Leu Arg Gln Pro Gln Gly Thr Pro Gly Ser Ser
225                 230                 235                 240
Thr Ser Tyr Phe Pro Ser Pro Asp Lys Pro Arg Ile Phe Ala Ser Leu
                245                 250                 255
Gly Thr Leu Gln Gly His Arg Tyr Gly Leu Phe Arg Thr Ile Ala Lys
            260                 265                 270
Ala Cys Glu Glu Val Asp Ala Gln Leu Leu Leu Ala His Cys Gly Gly
        275                 280                 285
Leu Ser Ala Thr Gln Ala Gly Glu Leu Ala Arg Gly Gly Asp Ile Gln
    290                 295                 300
Val Val Asp Phe Ala Asp Gln Ser Ala Ala Leu Ser Gln Ala Gln Leu
305                 310                 315                 320
Thr Ile Thr His Gly Gly Met Asn Thr Val Leu Asp Ala Ile Ala Ser
                325                 330                 335
Arg Thr Pro Leu Leu Ala Leu Pro Leu Ala Phe Asp Gln Pro Gly Val
            340                 345                 350
Ala Ser Arg Ile Val Tyr His Gly Ile Gly Lys Arg Ala Ser Arg Phe
        355                 360                 365
Thr Thr Ser His Ala Leu Ala Arg Gln Ile Arg Ser Leu Leu Thr Asn
    370                 375                 380
Thr Asp Tyr Pro Gln Arg Met Thr Lys Ile Gln Ala Ala Leu Arg Leu
385                 390                 395                 400
Ala Gly Gly Thr Pro Ala Ala Ala Asp Ile Val Glu Gln Ala Met Arg
                405                 410                 415
Thr Cys Gln Pro Val Leu Ser Gly Gln Asp Tyr Ala Thr Ala Leu
            420                 425                 430
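
The nucleotide and amino acid entries in this listing can be cross-checked against each other. The following Python sketch is an illustrative aid and not part of the filed listing. It translates the opening and closing codons of SEQ ID NO:1 with the standard genetic code and confirms that each stated DNA length equals three nucleotides per residue plus a stop codon. The helper names `translate` and `expected_cds_length` are hypothetical, and the use of the standard code for these coding sequences is an assumption.

```python
# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.lower().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def expected_cds_length(n_residues: int) -> int:
    """Coding-sequence length for n residues plus one stop codon."""
    return 3 * (n_residues + 1)

# First 48 and last 15 nucleotides of SEQ ID NO:1 as printed in the listing.
SEQ1_START = "atgagccatt ttgcggtgat cgcaccgccc tttttcagcc atgttcgc"
SEQ1_END = "gcaaccgca ctatga"

if __name__ == "__main__":
    print(translate(SEQ1_START))
    print(translate(SEQ1_END))
    print(expected_cds_length(431))
```

The translated opening codons correspond to residues 1 to 16 of SEQ ID NO:2, and the stated lengths of 1296, 1149, and 912 nucleotides match the 431, 382, and 303 residues of SEQ ID NOS:2, 4, and 6, respectively.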

<210> 3
<211> 1149
<212> DNA
<213> Pantoea stewartii

<400> 3

atgcaaccgc actatgatct cattctggtc ggtgccggtc tggctaatgg ccttatcgcg 60 

ctccggcttc agcaacagca tccggatatg cggatcttgc ttattgaggc gggtcctgag 120 

gcgggaggga accatacctg gtcctttcac gaagaggatt taacgctgaa tcagcatcgc 180 

tggatagcgc cgcttgtggt ccatcactgg cccgactacc aggttcgttt cccccaacgc 240 

cgtcgccatg tgaacagtgg ctactactgc gtgacctccc ggcatttcgc cgggatactc 300 

cggcaacagt ttggacaaca tttatggctg cataccgcgg tttcagccgt tcatgctgaa 360 

tcggtccagt tagcggatgg ccggattatt catgccagta cagtgatcga cggacggggt 420 

tacacgcctg attctgcact acgcgtagga ttccaggcat ttatcggtca ggagtggcaa 480 

ctgagcgcgc cgcatggttt atcgtcaccg attatcatgg atgcgacggt cgatcagcaa 540 

aatggctacc gctttgttta taccctgccg ctttccgcaa ccgcactgct gatcgaagac 600 

acacactaca ttgacaaggc taatcttcag gccgaacggg cgcgtcagaa cattcgcgat 660 

tatgctgcgc gacagggttg gccgttacag acgttgctgc gggaagaaca gggtgcattg 720 

cccattacgt taacgggcga taatcgtcag ttttggcaac agcaaccgca agcctgtagc 780 

ggattacgcg ccgggctgtt tcatccgaca accggctact ccctaccgct cgcggtggcg 840 

ctggccgatc gtctcagcgc gctggatgtg tttacctctt cctctgttca ccagacgatt 900 

gctcactttg cccagcaacg ttggcagcaa caggggtttt tccgcatgct gaatcgcatg 960 

ttgtttttag ccggaccggc cgagtcacgc tggcgtgtga tgcagcgttt ctatggctta 1020 

cccgaggatt tgattgcccg cttttatgcg ggaaaactca ccgtgaccga tcggctacgc 1080 

attctgagcg gcaagccgcc cgttcccgtt ttcgcggcat tgcaggcaat tatgacgact 1140 

catcgttga 1149 



<210> 4
<211> 382
<212> PRT
<213> Pantoea stewartii

<400> 4
Met Gln Pro His Tyr Asp Leu Ile Leu Val Gly Ala Gly Leu Ala Asn
1               5                   10                  15
Gly Leu Ile Ala Leu Arg Leu Gln Gln Gln His Pro Asp Met Arg Ile
            20                  25                  30
Leu Leu Ile Glu Ala Gly Pro Glu Ala Gly Gly Asn His Thr Trp Ser
        35                  40                  45
Phe His Glu Glu Asp Leu Thr Leu Asn Gln His Arg Trp Ile Ala Pro
    50                  55                  60
Leu Val Val His His Trp Pro Asp Tyr Gln Val Arg Phe Pro Gln Arg
65                  70                  75                  80
Arg Arg His Val Asn Ser Gly Tyr Tyr Cys Val Thr Ser Arg His Phe
                85                  90                  95
Ala Gly Ile Leu Arg Gln Gln Phe Gly Gln His Leu Trp Leu His Thr
            100                 105                 110
Ala Val Ser Ala Val His Ala Glu Ser Val Gln Leu Ala Asp Gly Arg
        115                 120                 125
Ile Ile His Ala Ser Thr Val Ile Asp Gly Arg Gly Tyr Thr Pro Asp
    130                 135                 140
Ser Ala Leu Arg Val Gly Phe Gln Ala Phe Ile Gly Gln Glu Trp Gln
145                 150                 155                 160
Leu Ser Ala Pro His Gly Leu Ser Ser Pro Ile Ile Met Asp Ala Thr
                165                 170                 175
Val Asp Gln Gln Asn Gly Tyr Arg Phe Val Tyr Thr Leu Pro Leu Ser
            180                 185                 190
Ala Thr Ala Leu Leu Ile Glu Asp Thr His Tyr Ile Asp Lys Ala Asn
        195                 200                 205
Leu Gln Ala Glu Arg Ala Arg Gln Asn Ile Arg Asp Tyr Ala Ala Arg
    210                 215                 220
Gln Gly Trp Pro Leu Gln Thr Leu Leu Arg Glu Glu Gln Gly Ala Leu
225                 230                 235                 240
Pro Ile Thr Leu Thr Gly Asp Asn Arg Gln Phe Trp Gln Gln Gln Pro
                245                 250                 255
Gln Ala Cys Ser Gly Leu Arg Ala Gly Leu Phe His Pro Thr Thr Gly
            260                 265                 270
Tyr Ser Leu Pro Leu Ala Val Ala Leu Ala Asp Arg Leu Ser Ala Leu
        275                 280                 285
Asp Val Phe Thr Ser Ser Ser Val His Gln Thr Ile Ala His Phe Ala
    290                 295                 300
Gln Gln Arg Trp Gln Gln Gln Gly Phe Phe Arg Met Leu Asn Arg Met
305                 310                 315                 320
Leu Phe Leu Ala Gly Pro Ala Glu Ser Arg Trp Arg Val Met Gln Arg
                325                 330                 335
Phe Tyr Gly Leu Pro Glu Asp Leu Ile Ala Arg Phe Tyr Ala Gly Lys
            340                 345                 350
Leu Thr Val Thr Asp Arg Leu Arg Ile Leu Ser Gly Lys Pro Pro Val
        355                 360                 365
Pro Val Phe Ala Ala Leu Gln Ala Ile Met Thr Thr His Arg
    370                 375                 380

<210> 5
<211> 912
<212> DNA
<213> Pantoea stewartii

<400> 5
atgatggtct gcgcaaaaaa acacgttcac cttactggca tttcggctga gcagttgctg 60
gctgatatcg atagccgcct tgatcagtta ctgccggttc agggtgagcg ggattgtgtg 120
ggtgccgcga tgcgtgaagg cacgctggca ccgggcaaac gtattcgtcc gatgctgctg 180
ttattaacag cgcgcgatct tggctgtgcg atcagtcacg ggggattact ggatttagcc 240
tgcgcggttg aaatggtgca tgctgcctcg ctgattctgg atgatatgcc ctgcatggac 300
gatgcgcaga tgcgtcgggg gcgtcccacc attcacacgc agtacggtga acatgtggcg 360
attctggcgg cggtcgcttt actcagcaaa gcgtttgggg tgattgccga ggctgaaggt 420
ctgacgccga tagccaaaac tcgcgcggtg tcggagctgt ccactgcgat tggcatgcag 480
ggtctggttc agggccagtt taaggacctc tcggaaggcg ataaaccccg cagcgccgat 540
gccatactgc taaccaatca gtttaaaacc agcacgctgt tttgcgcgtc aacgcaaatg 600
gcgtccattg cggccaacgc gtcctgcgaa gcgcgtgaga acctgcatcg tttctcgctc 660
gatctcggcc aggcctttca gttgcttgac gatcttaccg atggcatgac cgataccggc 720
aaagacatca atcaggatgc aggtaaatca acgctggtca atttattagg ctcaggcgcg 780
gtcgaagaac gcctgcgaca gcatttgcgc ctggccagtg aacacctttc cgcggcatgc 840
caaaacggcc attccaccac ccaacttttt attcaggcct ggtttgacaa aaaactcgct 900
gccgtcagtt aa 912



<210> 6 
<211> 303 
<212> PRT 

<213> Pantoea stewartii 



<400> 6 




























Met Met Val Cys Ala Lys Lys His Val His Leu Thr Gly Ile Ser Ala
1 5 10 15

Glu Gln Leu Leu Ala Asp Ile Asp Ser Arg Leu Asp Gln Leu Leu Pro
20 25 30

Val Gln Gly Glu Arg Asp Cys Val Gly Ala Ala Met Arg Glu Gly Thr
35 40 45

Leu Ala Pro Gly Lys Arg Ile Arg Pro Met Leu Leu Leu Leu Thr Ala
50 55 60

Arg Asp Leu Gly Cys Ala Ile Ser His Gly Gly Leu Leu Asp Leu Ala
65 70 75 80

Cys Ala Val Glu Met Val His Ala Ala Ser Leu Ile Leu Asp Asp Met
85 90 95

Pro Cys Met Asp Asp Ala Gln Met Arg Arg Gly Arg Pro Thr Ile His
100 105 110

Thr Gln Tyr Gly Glu His Val Ala Ile Leu Ala Ala Val Ala Leu Leu
115 120 125

Ser Lys Ala Phe Gly Val Ile Ala Glu Ala Glu Gly Leu Thr Pro Ile
130 135 140

Ala Lys Thr Arg Ala Val Ser Glu Leu Ser Thr Ala Ile Gly Met Gln
145 150 155 160

Gly Leu Val Gln Gly Gln Phe Lys Asp Leu Ser Glu Gly Asp Lys Pro
165 170 175

Arg Ser Ala Asp Ala Ile Leu Leu Thr Asn Gln Phe Lys Thr Ser Thr
180 185 190

Leu Phe Cys Ala Ser Thr Gln Met Ala Ser Ile Ala Ala Asn Ala Ser
195 200 205

Cys Glu Ala Arg Glu Asn Leu His Arg Phe Ser Leu Asp Leu Gly Gln
210 215 220

Ala Phe Gln Leu Leu Asp Asp Leu Thr Asp Gly Met Thr Asp Thr Gly
225 230 235 240

Lys Asp Ile Asn Gln Asp Ala Gly Lys Ser Thr Leu Val Asn Leu Leu
245 250 255

Gly Ser Gly Ala Val Glu Glu Arg Leu Arg Gln His Leu Arg Leu Ala
260 265 270

Ser Glu His Leu Ser Ala Ala Cys Gln Asn Gly His Ser Thr Thr Gln
275 280 285

Leu Phe Ile Gln Ala Trp Phe Asp Lys Lys Leu Ala Ala Val Ser
290 295 300











<210> 7 
<211> 1479 
<212> DNA 

<213> Pantoea stewartii 
<400> 7 

atgaaaccaa ctacggtaat tggtgcgggc tttggtggcc tggcactggc aattcgttta 60

caggccgcag gtattcctgt tttgctgctt gagcagcgcg acaagccggg tggccgggct 120

tatgtttatc aggagcaggg ctttactttt gatgcaggcc ctaccgttat caccgatccc 180

agcgcgattg aagaactgtt tgctctggcc ggtaaacagc ttaaggatta cgtcgagctg 240

ttgccggtca cgccgtttta tcgcctgtgc tgggagtccg gcaaggtctt caattacgat 300

aacgaccagg cccagttaga agcgcagata cagcagttta atccgcgcga tgttgcgggt 360

tatcgagcgt tccttgacta ttcgcgtgcc gtattcaatg agggctatct gaagctcggc 420

actgtgcctt ttttatcgtt caaagacatg cttcgggccg cgccccagtt ggcaaagctg 480

caggcatggc gcagcgttta cagtaaagtt gccggctaca ttgaggatga gcatcttcgg 540

caggcgtttt cttttcactc gctcttagtg ggggggaatc cgtttgcaac ctcgtccatt 600

tatacgctga ttcacgcgtt agaacgggaa tggggcgtct ggtttccacg cggtggaacc 660

ggtgcgctgg tcaatggcat gatcaagctg tttcaggatc tgggcggcga agtcgtgctt 720

aacgcccggg tcagtcatat ggaaaccgtt ggggacaaga ttcaggccgt gcagttggaa 780

gacggcagac ggtttgaaac ctgcgcggtg gcgtcgaacg ctgatgttgt acatacctat 840

cgcgatctgc tgtctcagca tcccgcagcc gctaagcagg cgaaaaaact gcaatccaag 900

cgtatgagta actcactgtt tgtactctat tttggtctca accatcatca cgatcaactc 960

gcccatcata ccgtctgttt tgggccacgc taccgtgaac tgattcacga aatttttaac 1020

catgatggtc tggctgagga tttttcgctt tatttacacg caccttgtgt cacggatccg 1080

tcactggcac cggaagggtg cggcagctat tatgtgctgg cgcctgttcc acacttaggc 1140

acggcgaacc tcgactgggc ggtagaagga ccccgactgc gcgatcgtat ttttgactac 1200

cttgagcaac attacatgcc tggcttgcga agccagttgg tgacgcaccg tatgtttacg 1260

ccgttcgatt tccgcgacga gctcaatgcc tggcaaggtt cggccttctc ggttgaacct 1320

attctgaccc agagcgcctg gttccgacca cataaccgcg ataagcacat tgataatctt 1380

tatctggttg gcgcaggcac ccatcctggc gcgggcattc ccggcgtaat cggctcggcg 1440

aaggcgacgg caggcttaat gctggaggac ctgatttga 1479

<210> 8 
<211> 492 
<212> PRT 

<213> Pantoea stewartii 
<400> 8 

Met Lys Pro Thr Thr Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu
1 5 10 15

Ala Ile Arg Leu Gln Ala Ala Gly Ile Pro Val Leu Leu Leu Glu Gln
20 25 30

Arg Asp Lys Pro Gly Gly Arg Ala Tyr Val Tyr Gln Glu Gln Gly Phe
35 40 45

Thr Phe Asp Ala Gly Pro Thr Val Ile Thr Asp Pro Ser Ala Ile Glu
50 55 60

Glu Leu Phe Ala Leu Ala Gly Lys Gln Leu Lys Asp Tyr Val Glu Leu
65 70 75 80

Leu Pro Val Thr Pro Phe Tyr Arg Leu Cys Trp Glu Ser Gly Lys Val
85 90 95

Phe Asn Tyr Asp Asn Asp Gln Ala Gln Leu Glu Ala Gln Ile Gln Gln
100 105 110

Phe Asn Pro Arg Asp Val Ala Gly Tyr Arg Ala Phe Leu Asp Tyr Ser
115 120 125

Arg Ala Val Phe Asn Glu Gly Tyr Leu Lys Leu Gly Thr Val Pro Phe
130 135 140

Leu Ser Phe Lys Asp Met Leu Arg Ala Ala Pro Gln Leu Ala Lys Leu
145 150 155 160

Gln Ala Trp Arg Ser Val Tyr Ser Lys Val Ala Gly Tyr Ile Glu Asp
165 170 175

Glu His Leu Arg Gln Ala Phe Ser Phe His Ser Leu Leu Val Gly Gly
180 185 190

Asn Pro Phe Ala Thr Ser Ser Ile Tyr Thr Leu Ile His Ala Leu Glu
195 200 205

Arg Glu Trp Gly Val Trp Phe Pro Arg Gly Gly Thr Gly Ala Leu Val
210 215 220

Asn Gly Met Ile Lys Leu Phe Gln Asp Leu Gly Gly Glu Val Val Leu
225 230 235 240

Asn Ala Arg Val Ser His Met Glu Thr Val Gly Asp Lys Ile Gln Ala
245 250 255

Val Gln Leu Glu Asp Gly Arg Arg Phe Glu Thr Cys Ala Val Ala Ser
260 265 270

Asn Ala Asp Val Val His Thr Tyr Arg Asp Leu Leu Ser Gln His Pro
275 280 285

Ala Ala Ala Lys Gln Ala Lys Lys Leu Gln Ser Lys Arg Met Ser Asn
290 295 300

Ser Leu Phe Val Leu Tyr Phe Gly Leu Asn His His His Asp Gln Leu
305 310 315 320

Ala His His Thr Val Cys Phe Gly Pro Arg Tyr Arg Glu Leu Ile His
325 330 335

Glu Ile Phe Asn His Asp Gly Leu Ala Glu Asp Phe Ser Leu Tyr Leu
340 345 350

His Ala Pro Cys Val Thr Asp Pro Ser Leu Ala Pro Glu Gly Cys Gly
355 360 365

Ser Tyr Tyr Val Leu Ala Pro Val Pro His Leu Gly Thr Ala Asn Leu
370 375 380

Asp Trp Ala Val Glu Gly Pro Arg Leu Arg Asp Arg Ile Phe Asp Tyr
385 390 395 400

Leu Glu Gln His Tyr Met Pro Gly Leu Arg Ser Gln Leu Val Thr His
405 410 415

Arg Met Phe Thr Pro Phe Asp Phe Arg Asp Glu Leu Asn Ala Trp Gln
420 425 430

Gly Ser Ala Phe Ser Val Glu Pro Ile Leu Thr Gln Ser Ala Trp Phe
435 440 445

Arg Pro His Asn Arg Asp Lys His Ile Asp Asn Leu Tyr Leu Val Gly
450 455 460

Ala Gly Thr His Pro Gly Ala Gly Ile Pro Gly Val Ile Gly Ser Ala
465 470 475 480

Lys Ala Thr Ala Gly Leu Met Leu Glu Asp Leu Ile
485 490















<210> 9 
<211> 893 
<212> DNA 

<213> Pantoea stewartii 



<400> 9 

ccatggcggt tggctcgaaa agctttgcga ctgcatcgac gcttttcgac gccaaaaccc 60

gtcgcagcgt gctgatgctt tacgcatggt gccgccactg cgacgacgtc attgacgatc 120

aaacactggg ctttcatgcc gaccagccct cttcgcagat gcctgagcag cgcctgcagc 180

agcttgaaat gaaaacgcgt caggcctacg ccggttcgca aatgcacgag cccgcttttg 240

ccgcgtttca ggaggtcgcg atggcgcatg atatcgctcc cgcctacgcg ttcgaccatc 300

tggaaggttt tgccatggat gtgcgcgaaa cgcgctacct gacactggac gatacgctgc 360

gttattgcta tcacgtcgcc ggtgttgtgg gcctgatgat ggcgcaaatt atgggcgttc 420

gcgataacgc cacgctcgat cgcgcctgcg atctcgggct ggctttccag ttgaccaaca 480

ttgcgcgtga tattgtcgac gatgctcagg tgggccgctg ttatctgcct gaaagctggc 540

tggaagagga aggactgacg aaagcgaatt atgctgcgcc agaaaaccgg caggccttaa 600

gccgtatcgc cgggcgactg gtacgggaag cggaacccta ttacgtatca tcaatggccg 660

gtctggcaca attaccctta cgctcggcct gggccatcgc gacagcgaag caggtgtacc 720

gtaaaattgg cgtgaaagtt gaacaggccg gtaagcaggc ctgggatcat cgccagtcca 780

cgtccaccgc cgaaaaatta acgcttttgc tgacggcatc cggtcaggca gttacttccc 840

ggatgaagac gtatccaccc cgtcctgctc atctctggca gcgcccgatc tag 893



<210> 10 
<211> 296 
<212> PRT 

<213> Pantoea stewartii



<400> 10 












Met Ala Val Gly Ser Lys Ser Phe Ala Thr Ala Ser Thr Leu Phe Asp
1 5 10 15

Ala Lys Thr Arg Arg Ser Val Leu Met Leu Tyr Ala Trp Cys Arg His
20 25 30

Cys Asp Asp Val Ile Asp Asp Gln Thr Leu Gly Phe His Ala Asp Gln
35 40 45

Pro Ser Ser Gln Met Pro Glu Gln Arg Leu Gln Gln Leu Glu Met Lys
50 55 60

Thr Arg Gln Ala Tyr Ala Gly Ser Gln Met His Glu Pro Ala Phe Ala
65 70 75 80

Ala Phe Gln Glu Val Ala Met Ala His Asp Ile Ala Pro Ala Tyr Ala
85 90 95

Phe Asp His Leu Glu Gly Phe Ala Met Asp Val Arg Glu Thr Arg Tyr
100 105 110

Leu Thr Leu Asp Asp Thr Leu Arg Tyr Cys Tyr His Val Ala Gly Val
115 120 125

Val Gly Leu Met Met Ala Gln Ile Met Gly Val Arg Asp Asn Ala Thr
130 135 140

Leu Asp Arg Ala Cys Asp Leu Gly Leu Ala Phe Gln Leu Thr Asn Ile
145 150 155 160

Ala Arg Asp Ile Val Asp Asp Ala Gln Val Gly Arg Cys Tyr Leu Pro
165 170 175

Glu Ser Trp Leu Glu Glu Glu Gly Leu Thr Lys Ala Asn Tyr Ala Ala
180 185 190

Pro Glu Asn Arg Gln Ala Leu Ser Arg Ile Ala Gly Arg Leu Val Arg
195 200 205

Glu Ala Glu Pro Tyr Tyr Val Ser Ser Met Ala Gly Leu Ala Gln Leu
210 215 220

Pro Leu Arg Ser Ala Trp Ala Ile Ala Thr Ala Lys Gln Val Tyr Arg
225 230 235 240

Lys Ile Gly Val Lys Val Glu Gln Ala Gly Lys Gln Ala Trp Asp His
245 250 255

Arg Gln Ser Thr Ser Thr Ala Glu Lys Leu Thr Leu Leu Leu Thr Ala
260 265 270

Ser Gly Gln Ala Val Thr Ser Arg Met Lys Thr Tyr Pro Pro Arg Pro
275 280 285

Ala His Leu Trp Gln Arg Pro Ile
290 295


<210> 11 
<211> 528 
<212> DNA 

<213> Pantoea stewartii 



<400> 11 

atgttgtgga tttggaatgc cctgatcgtg tttgtcaccg tggtcggcat ggaagtggtt 60

gctgcactgg cacataaata catcatgcac ggctggggtt ggggctggca tctttcacat 120

catgaaccgc gtaaaggcgc atttgaagtt aacgatctct atgccgtggt attcgccatt 180

gtgtcgattg ccctgattta cttcggcagt acaggaatct ggccgctcca gtggattggt 240

gcaggcatga ccgcttatgg tttactgtat tttatggtcc acgacggact ggtacaccag 300

cgctggccgt tccgctacat accgcgcaaa ggctacctga aacggttata catggcccac 360

cgtatgcatc atgctgtaag gggaaaagag ggctgcgtgt cctttggttt tctgtacgcg 420

ccaccgttat ctaaacttca ggcgacgctg agagaaaggc atgcggctag atcgggcgct 480

gccagagatg agcaggacgg ggtggatacg tcttcatccg ggaagtaa 528

<210> 12
<211> 175
<212> PRT

<213> Pantoea stewartii
<400> 12

Met Leu Trp Ile Trp Asn Ala Leu Ile Val Phe Val Thr Val Val Gly
1 5 10 15

Met Glu Val Val Ala Ala Leu Ala His Lys Tyr Ile Met His Gly Trp
20 25 30

Gly Trp Gly Trp His Leu Ser His His Glu Pro Arg Lys Gly Ala Phe
35 40 45

Glu Val Asn Asp Leu Tyr Ala Val Val Phe Ala Ile Val Ser Ile Ala
50 55 60

Leu Ile Tyr Phe Gly Ser Thr Gly Ile Trp Pro Leu Gln Trp Ile Gly
65 70 75 80

Ala Gly Met Thr Ala Tyr Gly Leu Leu Tyr Phe Met Val His Asp Gly
85 90 95

Leu Val His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr
100 105 110

Leu Lys Arg Leu Tyr Met Ala His Arg Met His His Ala Val Arg Gly
115 120 125

Lys Glu Gly Cys Val Ser Phe Gly Phe Leu Tyr Ala Pro Pro Leu Ser
130 135 140

Lys Leu Gln Ala Thr Leu Arg Glu Arg His Ala Ala Arg Ser Gly Ala
145 150 155 160

Ala Arg Asp Glu Gln Asp Gly Val Asp Thr Ser Ser Ser Gly Lys
165 170 175



<210> 13 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 13 

atyatgcacg gctggggwtg gsgmtggca 29 

<210> 14 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 14 

ggccarcgyt gatgcaccag mccgtcrtgc a 31 

<210> 15 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 15 

ctgatgctct aygcctggtg ccgcca 26 

<210> 16 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 16

tcgcgrgcra trttsgtcar ctg 23

<210> 17 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 17

atbmtsatgg aygcsacsgt 20

<210> 18
<211> 20
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 18 

ytratcgarg ayacgcrcta 20

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 19 

rsggcagyga atagccrgtg 20

<210> 20 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 20 

aacagcatsc grttcagcak gcgsa 25

<210> 21 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 21 



ccgacggtka tcaccgatcc 20

<210> 22 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 22 

ctgcgccsac caggtagag 19 

<210> 23 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 23 

ctygacgaya tgccctgcat ggac 24 

<210> 24 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 24 

gtcgatttwc csgcgtcctk attg 24 

<210> 25 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 25 

ggccgaattc caacgatgct ctggcagtta 30 

<210> 26 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 26 

ggccagatct acttcaggcg acgctgagag 30 
<210> 27
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 27 

ggccagatct tacgcgcggg taaagccaat 30 

<210> 28 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 28 

ggcctctaga attaccgcgt ggttctgaag 30 

<210> 29 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 29

ggcctctaga tctgtacgcg ccaccgttat 30

<210> 30 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer
<400> 30 

catcggtaag atcgtcaagc aactgaa 27 

<210> 31 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 31 

gatttacctg catcctgatt gatgtct 27 

<210> 32 
<211> 27



WO 02/079395 



13/17 



PCT/US02/02124



<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 32

atgtataacc gtttcaggta gcctttg 27

<210> 33 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 33

aatacagtaa accataagcg gtcatgc 27

<210> 34 
<211> 18 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer 



<400> 34 

ttcatcatcg cgcatgac 18

<210> 35
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 35

agrtgrtgyt cgtgrtga 18

<210> 36 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 36

gcggcatagg ctagattgaa g 21

<210> 37 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Primer 



<400> 37 

gcgagttcct tctcacctat 20

<210> 38
<211> 735
<212> DNA
<213> Brevundimonas aurantiaca



<400> 38 

atgaccgccg ccgtcgccga gccacgcacc gtcccgcgcc agacctggat cggtctgacc 60

ctggcgggaa tgatcgtggc gggatgggcg gttctgcatg tctacggcgt ctattttcac 120

cgatgggggc cgttgaccct ggtgatcgcc ccggcgatcg tggcggtcca gacctggttg 180

tcggtcggcc ttttcatcgt cgcccatgac gccatgtacg gctccctggc gccgggacgg 240

ccgcggctga acgccgcagt cggccggctg accctggggc tctatgcggg cttccgcttc 300

gatcggctga agacggcgca ccacgcccac cacgccgcgc ccggcacggc cgacgacccg 360

gattttcacg ccccggcgcc ccgcgccttc cttccctggt tcctgaactt ctttcgcacc 420

tatttcggct ggcgcgagat ggcggtcctg accgccctgg tcctgatcgc cctcttcggc 480

ctgggggcgc ggccggccaa tctcctgacc ttctgggccg cgccggccct gctttcagcg 540

cttcagctct tcaccttcgg cacctggctg ccgcaccgcc acaccgacca gccgttcgcc 600

gacgcgcacc acgcccgcag cagcggctac ggccccgtgc tttccctgct cacctgtttc 660

cacttcggcc gccaccacga acaccatctg agcccctggc ggccctggtg gcgtctgtgg 720

cgcggcgagt cttga 735



<210> 39 

<211> 244 

<212> PRT 

<213> Brevundimonas aurantiaca 



<400> 39 




























Met Thr Ala Ala Val Ala Glu Pro Arg Thr Val Pro Arg Gln Thr Trp
1 5 10 15

Ile Gly Leu Thr Leu Ala Gly Met Ile Val Ala Gly Trp Ala Val Leu
20 25 30

His Val Tyr Gly Val Tyr Phe His Arg Trp Gly Pro Leu Thr Leu Val
35 40 45

Ile Ala Pro Ala Ile Val Ala Val Gln Thr Trp Leu Ser Val Gly Leu
50 55 60

Phe Ile Val Ala His Asp Ala Met Tyr Gly Ser Leu Ala Pro Gly Arg
65 70 75 80

Pro Arg Leu Asn Ala Ala Val Gly Arg Leu Thr Leu Gly Leu Tyr Ala
85 90 95

Gly Phe Arg Phe Asp Arg Leu Lys Thr Ala His His Ala His His Ala
100 105 110

Ala Pro Gly Thr Ala Asp Asp Pro Asp Phe His Ala Pro Ala Pro Arg
115 120 125

Ala Phe Leu Pro Trp Phe Leu Asn Phe Phe Arg Thr Tyr Phe Gly Trp
130 135 140

Arg Glu Met Ala Val Leu Thr Ala Leu Val Leu Ile Ala Leu Phe Gly
145 150 155 160

Leu Gly Ala Arg Pro Ala Asn Leu Leu Thr Phe Trp Ala Ala Pro Ala
165 170 175

Leu Leu Ser Ala Leu Gln Leu Phe Thr Phe Gly Thr Trp Leu Pro His
180 185 190

Arg His Thr Asp Gln Pro Phe Ala Asp Ala His His Ala Arg Ser Ser
195 200 205

Gly Tyr Gly Pro Val Leu Ser Leu Leu Thr Cys Phe His Phe Gly Arg
210 215 220

His His Glu His His Leu Ser Pro Trp Arg Pro Trp Trp Arg Leu Trp
225 230 235 240

Arg Gly Glu Ser



<210> 40 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 40

ccaygaygay atwatgga 18

<210> 41 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 41 

yttyttvccy tycctaat 18

<210> 42 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 42 

acagcgttgg acactcag 18

<210> 43 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 43 

gcgtcgataa tggaagtgag 20

<210> 44 
<211> 1496 
<212> DNA 

<213> Sulfolobus shibatae 
<400> 44

ttaccagtgt taaaaagtgc tatagaaggt aaggaaagtt tagaacaatt ctttagaaag 60

ataatatttg aattgaaggc cgccatgatg cttactggtt ctaaagacgt tgatgcgtta 120

aagaagacca gtattgttat tttaggtaaa cttaaagagt gggcagaata tagggggata 180

aatttatcta tatacgagaa agttagaaag agagaataaa atgagtgacg aattaagttc 240

gtattttaat gatatagtta acaatgtaaa ttttcatata aaaaattttg taaagagcaa 300

tgttagaacg cttgaggaag catcgtttca tttatttaca gctgggggca aaagacttag 360

acccttaatt ctggtttcat cgtcagactt aattggcggg gacaggcaaa gggcatataa 420

ggcagcagct gccgtggaga ttcttcacaa ctttactcta gttcatgacg atataatgga 480

tagggattac ctaagaagag gattaccaac tgttcatgta aagtggggtg aaccaatggc 540

aatacttgca ggtgattact tacacgccaa ggcttttgaa gccttaaatg aggctctaaa 600

aggtcttgac gggaatacgt tttataaggc tttttccgta tttattaatt ctattgagat 660

aatatcggaa ggtcaagcaa tggatatgtc atttgaaaat agagtagatg taactgagga 720

agagtacatg caaatgataa aaggaaagac tgcgatgcta ttttcatgtt ctgctgcatt 780

aggcggtata attaacaagg ctagcgatga tataattaaa aatttagtcg aatatggatt 840

aaatctaggc atatcattcc aaatagtgga tgatatctta ggaattattg gagaccaaaa 900

ggaattaggg aaaccagttt acagtgatat tagggaaggt aagaaaacaa ttcttgttat 960

aaaaacttta agtgaagcta ctgacgatga aaagaaaatt ctggtttcta cgcttgggaa 1020

tagggaggct aaaaaggacg atcttgagag agcgtcggaa ataataagga agtattcatt 1080

gcaatatgca tacaatttag ctaaaaagta ctcagatctt gcattagaac atttgcgtaa 1140

aattccagtt tacaatgaaa ctgctgaaaa ggctttaaaa tatctagcgc agtttaccat 1200

tgaaaggaga aagtaaatga gcatatcagg gatattgctt tcaattttta tatccttttt 1260

cataagctat attacaacag tctgggtaat aagacaggca aaaaagagtg ggcttgtagg 1320

taaggatgta aataaaccag ataaaccgga aataccacta atgggtggga taagtataat 1380

agccgggttt atagcgggat ccttctcctt attactaact gatgtaagaa gtgagcgagt 1440

aattccatct gtaatactct cctcattgct tatagcattt cttggactat tagatg 1496

<210> 45
<211> 331
<212> PRT
<213> Sulfolobus shibatae
<400> 45

Met Ser Asp Glu Leu Ser Ser Tyr Phe Asn Asp Ile Val Asn Asn Val
1 5 10 15

Asn Phe His Ile Lys Asn Phe Val Lys Ser Asn Val Arg Thr Leu Glu
20 25 30

Glu Ala Ser Phe His Leu Phe Thr Ala Gly Gly Lys Arg Leu Arg Pro
35 40 45

Leu Ile Leu Val Ser Ser Ser Asp Leu Ile Gly Gly Asp Arg Gln Arg
50 55 60

Ala Tyr Lys Ala Ala Ala Ala Val Glu Ile Leu His Asn Phe Thr Leu
65 70 75 80

Val His Asp Asp Ile Met Asp Arg Asp Tyr Leu Arg Arg Gly Leu Pro
85 90 95

Thr Val His Val Lys Trp Gly Glu Pro Met Ala Ile Leu Ala Gly Asp
100 105 110

Tyr Leu His Ala Lys Ala Phe Glu Ala Leu Asn Glu Ala Leu Lys Gly
115 120 125

Leu Asp Gly Asn Thr Phe Tyr Lys Ala Phe Ser Val Phe Ile Asn Ser
130 135 140

Ile Glu Ile Ile Ser Glu Gly Gln Ala Met Asp Met Ser Phe Glu Asn
145 150 155 160

Arg Val Asp Val Thr Glu Glu Glu Tyr Met Gln Met Ile Lys Gly Lys
165 170 175

Thr Ala Met Leu Phe Ser Cys Ser Ala Ala Leu Gly Gly Ile Ile Asn
180 185 190

Lys Ala Ser Asp Asp Ile Ile Lys Asn Leu Val Glu Tyr Gly Leu Asn
195 200 205

Leu Gly Ile Ser Phe Gln Ile Val Asp Asp Ile Leu Gly Ile Ile Gly
210 215 220

Asp Gln Lys Glu Leu Gly Lys Pro Val Tyr Ser Asp Ile Arg Glu Gly
225 230 235 240

Lys Lys Thr Ile Leu Val Ile Lys Thr Leu Ser Glu Ala Thr Asp Asp
245 250 255

Glu Lys Lys Ile Leu Val Ser Thr Leu Gly Asn Arg Glu Ala Lys Lys
260 265 270

Asp Asp Leu Glu Arg Ala Ser Glu Ile Ile Arg Lys Tyr Ser Leu Gln
275 280 285

Tyr Ala Tyr Asn Leu Ala Lys Lys Tyr Ser Asp Leu Ala Leu Glu His
290 295 300

Leu Arg Lys Ile Pro Val Tyr Asn Glu Thr Ala Glu Lys Ala Leu Lys
305 310 315 320

Tyr Leu Ala Gln Phe Thr Ile Glu Arg Arg Lys
325 330


<210> 46 

<211> 20 

<212> DNA 

<213> Artificial Sequence
<220>

<223> Exemplary motif
<400> 46

aggtcgtgta ctgtcagtca 20

<210> 47 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Exemplary motif 
<400> 47 

acgtggtgaa ctgccagtga 20
Carotenoid Biosynthesis

TECHNICAL FIELD 

The invention relates to methods and materials for producing carotenoids, and in 
particular, to nucleic acid molecules, polypeptides, host cells, and methods that can be 
used for producing carotenoids. 

BACKGROUND

Astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'-dione) is the primary carotenoid
that imparts the pink pigment to the eggs, flesh, and skin of salmon, trout, and shrimp.
Most animals cannot synthesize carotenoids. Rather, the pigments are acquired through
the food chain from marine algae and phytoplankton, the primary producers of
astaxanthin. Astaxanthin (ATX) exists in three configurational isomers [(3S,3'S), (3R,3'R), and
(3S,3'R; 3R,3'S)]; however, ATX is found in the marine environment only in the (3S,3'S)
form. Consequently, this form is considered the natural and most desirable form of ATX.

Although astaxanthin has been commercially extracted from some yeast and
Crustacea species and has been chemically synthesized as a 1:2:1 mixture of the (3S,3'S)-,
(3S,3'R)- and (3R,3'R)-isomers, astaxanthin is limited in availability and is expensive to
purchase. See, Torrisen et al. (1989) Crit. Rev. Aquatic Sci. 1:209; and Mayer (1994)
Pure Appl. Chem., 66:931-938. Thus, there is a need for a less expensive source of the
naturally-occurring (3S,3'S) astaxanthin.

SUMMARY

The invention is based on methods and materials for producing carotenoids such
as lycopene, zeaxanthin, zeaxanthin diglucoside, canthaxanthin, β-carotene, lutein, and
astaxanthin. Such carotenoids can be used as nutritional supplements in humans and can
be formulated for use in aquaculture or as an animal feed. The invention provides nucleic
acid molecules that can be used to engineer host cells having the ability to produce
particular carotenoids and polypeptides that can be used in cell-free systems to make
particular carotenoids. The engineered cells described herein can be used to produce
large quantities of carotenoids.
In one aspect, the invention features an isolated nucleic acid having at least 76%
sequence identity to the nucleotide sequence of SEQ ID NO:1 (e.g., at least 80%, 85%,
90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:1) or to a
fragment of SEQ ID NO:1 at least 33 contiguous nucleotides in length. An isolated
nucleic acid can encode a zeaxanthin glucosyl transferase polypeptide at least 75%
identical to the amino acid sequence of SEQ ID NO:2. Expression vectors containing
such nucleic acids operably linked to an expression control element also are featured.
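
As an illustration only, the percent-identity thresholds recited above can be checked with a short calculation. The sketch below is not the method defined by this document; this section does not state how percent identity is computed. It assumes two sequences that have already been aligned to equal length (with `-` marking gaps) and divides the number of identical columns by the alignment length. The function names `percent_identity` and `meets_minimum` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Assumption: the two strings are already aligned (equal length, '-' for gaps).
    The denominator is the alignment length, which is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def meets_minimum(aligned_a: str, aligned_b: str, minimum_percent: float) -> bool:
    """True when the aligned pair reaches the stated minimum percent identity."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent
```

For example, `meets_minimum(candidate, reference, 76)` would test a candidate against the 76% figure recited for SEQ ID NO:1, where `candidate` and `reference` are placeholder names for pre-aligned strings.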

In another aspect, the invention features an isolated nucleic acid having at least
78% sequence identity to the nucleotide sequence of SEQ ID NO:3 (e.g., at least 80%,
85%, 90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:3) or to a
fragment of SEQ ID NO:3 at least 32 contiguous nucleotides in length. An isolated
nucleic acid can encode a lycopene β-cyclase polypeptide at least 83% identical to the
amino acid sequence of SEQ ID NO:4. β-Carotene can be made by contacting lycopene
with a polypeptide encoded by such isolated nucleic acids. The invention also features an
expression vector that includes such nucleic acids operably linked to an expression
control element.

In yet another aspect, the invention features an isolated nucleic acid having at least
81% sequence identity to the nucleotide sequence of SEQ ID NO:5 (e.g., at least 85%,
90%, or 95% sequence identity to the nucleotide sequence of SEQ ID NO:5) or to a
fragment of SEQ ID NO:5 at least 60 contiguous nucleotides in length. An isolated
nucleic acid also can encode a geranylgeranyl pyrophosphate synthase polypeptide at
least 85% identical to the amino acid sequence of SEQ ID NO:6. Geranylgeranyl
pyrophosphate can be made by contacting farnesyl pyrophosphate and isopentenyl
pyrophosphate with a polypeptide encoded by such nucleic acids. Expression vectors that
include such nucleic acids operably linked to an expression control element also are
featured.

Isolated nucleic acids having at least 82% sequence identity to the nucleotide
sequence of SEQ ID NO:7 (e.g., at least 85%, 90%, or 95% sequence identity to the
nucleotide sequence of SEQ ID NO:7) or to a fragment of SEQ ID NO:7 at least 30
contiguous nucleotides in length also are featured. An isolated nucleic acid also can
encode a phytoene desaturase polypeptide at least 90% identical to the amino acid
sequence of SEQ ID NO:8. Lycopene can be made by contacting phytoene with a
polypeptide encoded by such nucleic acids. An expression vector that includes such
nucleic acids operably linked to an expression control element also is featured.

The invention also features an isolated nucleic acid having at least 82% sequence
identity to the nucleotide sequence of SEQ ID NO:9 (e.g., at least 85%, 90%, or 95%
sequence identity to the nucleotide sequence of SEQ ID NO:9) or to a fragment of SEQ
ID NO:9 at least 23 contiguous nucleotides in length. An isolated nucleic acid also can
encode a phytoene synthase polypeptide at least 89% identical to the amino acid sequence
of SEQ ID NO:10. Phytoene can be made by contacting geranylgeranyl pyrophosphate
with a polypeptide encoded by such nucleic acids. An expression vector that includes
such nucleic acids operably linked to an expression control element also is featured.

In yet another aspect, the invention features an isolated nucleic acid having at least
85% sequence identity to the nucleotide sequence of SEQ ID NO:11 (e.g., at least 90% or
95% identity to the nucleotide sequence of SEQ ID NO:11) or to a fragment of SEQ ID
NO:11 at least 36 contiguous nucleotides in length. An isolated nucleic acid can encode a
β-carotene hydroxylase polypeptide at least 90% identical to the amino acid sequence of
SEQ ID NO:12. Zeaxanthin can be made by contacting β-carotene with a polypeptide
encoded by such nucleic acids. Astaxanthin can be made by contacting canthaxanthin
with a polypeptide encoded by such nucleic acids. The invention also features an
expression vector that includes such nucleic acids operably linked to an expression
control element.
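
The substrate-to-product conversions recited in the preceding aspects can be collected into a small lookup structure. The sketch below is illustrative only: the names `STEPS` and `reachable_compounds` are hypothetical, and only conversions stated in the paragraphs above are included (no step producing canthaxanthin is listed, because none has been recited up to this point).

```python
# (enzyme, required substrates, product), taken from the conversions recited above.
STEPS = [
    ("geranylgeranyl pyrophosphate synthase",
     frozenset({"farnesyl pyrophosphate", "isopentenyl pyrophosphate"}),
     "geranylgeranyl pyrophosphate"),
    ("phytoene synthase", frozenset({"geranylgeranyl pyrophosphate"}), "phytoene"),
    ("phytoene desaturase", frozenset({"phytoene"}), "lycopene"),
    ("lycopene beta-cyclase", frozenset({"lycopene"}), "beta-carotene"),
    ("beta-carotene hydroxylase", frozenset({"beta-carotene"}), "zeaxanthin"),
    ("beta-carotene hydroxylase", frozenset({"canthaxanthin"}), "astaxanthin"),
]


def reachable_compounds(starting_compounds, enzymes_present):
    """Return every compound obtainable from the starting pool with the given enzymes."""
    pool = set(starting_compounds)
    changed = True
    while changed:  # repeat until no step adds a new compound
        changed = False
        for enzyme, substrates, product in STEPS:
            if enzyme in enzymes_present and substrates <= pool and product not in pool:
                pool.add(product)
                changed = True
    return pool
```

Called with farnesyl pyrophosphate and isopentenyl pyrophosphate as the starting pool and the first three enzymes, the function returns a set that includes lycopene but not zeaxanthin, which mirrors the stepwise order in which the products are described.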

The invention also features membranous bacteria (e.g., a Rhodobacter species)
that include at least one exogenous nucleic acid encoding phytoene desaturase, lycopene
β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase, wherein expression of
the at least one exogenous nucleic acid produces detectable amounts of astaxanthin in the
membranous bacteria. The amino acid sequence of the phytoene desaturase can be at
least 90% identical to the amino acid sequence of SEQ ID NO:8. The amino acid
sequence of the lycopene β-cyclase can be at least 83% identical to the amino acid
sequence of SEQ ID NO:4. The amino acid sequence of the β-carotene hydroxylase can
be at least 90% identical to the amino acid sequence of SEQ ID NO:12. The amino acid
sequence of the β-carotene C4 oxygenase can be at least 80% identical to the amino acid
sequence of SEQ ID NO:39. The membranous bacteria further can include an exogenous
nucleic acid encoding geranylgeranyl pyrophosphate synthase (e.g., a multifunctional
geranylgeranyl pyrophosphate synthase) or can lack endogenous bacteriochlorophyll
biosynthesis. The multifunctional geranylgeranyl pyrophosphate synthase can have an
amino acid sequence at least 90% identical to the amino acid sequence of SEQ ID NO:45.
The membranous bacteria further can include an exogenous nucleic acid encoding
phytoene synthase. The phytoene synthase can have an amino acid sequence at least 89%
identical to the amino acid sequence of SEQ ID NO:10.

In another aspect, the invention features membranous bacteria that include an 
exogenous nucleic acid encoding a phytoene desaturase having an amino acid sequence at 
least 90% identical to the amino acid sequence of SEQ ID NO:8, wherein the 
membranous bacteria produce detectable amounts of lycopene. The membranous 
bacteria further can include a lycopene β-cyclase, wherein the membranous bacteria 
produce detectable amounts of β-carotene. The membranous bacteria also can include a 
β-carotene hydroxylase, wherein the membranous bacteria produce detectable amounts of 
zeaxanthin. 

In still yet another aspect, the invention features membranous bacteria that include 
at least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase, 
and β-carotene C4 oxygenase, wherein expression of the at least one exogenous nucleic 
acid produces detectable amounts of canthaxanthin in the membranous bacteria. The 
membranous bacteria also can include a β-carotene hydroxylase, wherein the 
membranous bacteria produce detectable amounts of astaxanthin. 

The invention also features a composition that includes an engineered 
Rhodobacter cell, wherein the cell produces a detectable amount of astaxanthin or 
canthaxanthin. The engineered Rhodobacter cell can include at least one exogenous 
nucleic acid encoding phytoene desaturase, lycopene β-cyclase, β-carotene hydroxylase, 
and β-carotene C4 oxygenase. The composition can be formulated for aquaculture and 
can pigment the flesh of fish or the carapace of crustaceans after ingestion. The 
composition can be formulated for human consumption or as an animal feed (e.g., 
formulated for consumption by chickens, turkeys, cattle, swine, or sheep). 

The invention also features a method of making a nutraceutical. The method 
includes extracting carotenoids from an engineered Rhodobacter cell, the engineered 
Rhodobacter cell including at least one exogenous nucleic acid encoding phytoene 
desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase, 
wherein the Rhodobacter cell produces detectable amounts of astaxanthin. 

In yet another aspect, the invention features membranous bacteria, wherein the 
membranous bacteria include an exogenous nucleic acid encoding a lycopene β-cyclase 
having an amino acid sequence at least 83% identical to the amino acid sequence of SEQ 
ID NO:4. The membranous bacteria further can include a phytoene desaturase (e.g., an 
exogenous phytoene desaturase), wherein the membranous bacteria produce detectable 
amounts of β-carotene. The membranous bacteria also can include a β-carotene 
hydroxylase (e.g., an exogenous β-carotene hydroxylase), wherein the bacteria produce 
detectable amounts of zeaxanthin. 

Membranous bacteria that include a β-carotene hydroxylase having an amino acid 
sequence at least 90% identical to the amino acid sequence of SEQ ID NO:12 also are 
featured. The membranous bacteria further can include a lycopene β-cyclase (e.g., an 
exogenous lycopene β-cyclase), wherein the membranous bacteria produce detectable 
amounts of zeaxanthin. The membranous bacteria also can include a phytoene desaturase 
(e.g., an exogenous phytoene desaturase), wherein the membranous bacteria produce 
detectable amounts of β-carotene. 

The invention also features membranous bacteria (e.g., a Rhodobacter species) 
lacking an endogenous nucleic acid encoding a farnesyl pyrophosphate synthase, wherein 
the bacteria produce detectable amounts of carotenoids. The membranous bacteria also 
can include an exogenous nucleic acid encoding a multifunctional geranylgeranyl 
pyrophosphate synthase. 

In another aspect, the invention features an isolated nucleic acid having at least 
70% sequence identity (e.g., at least 80% or 90%) to the nucleotide sequence of SEQ ID 
NO:38, or to a fragment of the nucleic acid of SEQ ID NO:38 at least 15 contiguous 
nucleotides in length. The nucleic acid can encode a β-carotene C4 oxygenase. 
Canthaxanthin can be made by contacting β-carotene with a polypeptide encoded by such 
nucleic acids or a polypeptide having an amino acid sequence at least 80% identical to the 
amino acid sequence of SEQ ID NO:39. Astaxanthin can be made by contacting 
zeaxanthin with a polypeptide encoded by such isolated nucleic acids or a polypeptide 
having an amino acid sequence at least 80% identical to the amino acid sequence of SEQ 
ID NO:39. 

In another aspect, the invention features membranous bacteria that include an 
exogenous nucleic acid encoding a β-carotene C4 oxygenase, where the β-carotene 
oxygenase has an amino acid sequence at least 80% identical to the amino acid sequence 
of SEQ ID NO:39. 

In yet another aspect, the invention features a host cell comprising an exogenous 
nucleic acid, wherein the exogenous nucleic acid includes a nucleic acid sequence 
encoding one or more polypeptides that catalyze the formation of (3S, 3'S) astaxanthin, 
wherein the host cell produces CoQ-10 and (3S, 3'S) astaxanthin. A method of making 
CoQ-10 and (3S, 3'S) astaxanthin at substantially the same time also is featured. The 
method includes transforming a host cell with a nucleic acid, wherein the nucleic acid 
includes a nucleic acid sequence that encodes one or more polypeptides, wherein the 
polypeptides catalyze the formation of (3S, 3'S) astaxanthin; and culturing the host cell 
under conditions that allow for the production of (3S, 3'S) astaxanthin and CoQ-10. The 
method further can include transforming the host cell with at least one exogenous nucleic 
acid, the exogenous nucleic acid encoding one or more polypeptides, wherein the 
polypeptides catalyze the formation of CoQ-10. 

The invention also features an isolated nucleic acid having a nucleotide sequence 
selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ 
ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:38, and SEQ ID NO:44. 

An isolated nucleic acid having at least 90% sequence identity to the nucleotide 
sequence of SEQ ID NO:44, or to a fragment of the nucleic acid of SEQ ID NO:44 at 
least 60 contiguous nucleotides in length, is featured. Geranylgeranyl pyrophosphate can 
be made by contacting isopentenyl pyrophosphate and dimethylallyl pyrophosphate with 
a polypeptide encoded by such a nucleic acid. 

Unless otherwise defined, all technical and scientific terms used herein have the 
same meaning as commonly understood by one of ordinary skill in the art to which this 
invention belongs. Although methods and materials similar or equivalent to those 
described herein can be used to practice the invention, suitable methods and materials are 
described below. All publications, patent applications, patents, and other references 
mentioned herein are incorporated by reference in their entirety. In case of conflict, the 
present specification, including definitions, will control. In addition, the materials, 
methods, and examples are illustrative only and not intended to be limiting. 

Other features and advantages of the invention will be apparent from the 
following detailed description, and from the claims. 

DESCRIPTION OF DRAWINGS 

FIG. 1 is a schematic diagram of the biosynthetic pathway for the production of 
zeaxanthin and conversion to zeaxanthin di-glucoside. 

FIG. 2 is a schematic diagram of the P. stewartii carotenoid gene operon (6586 bp). 

FIG. 3 is a chromatogram of astaxanthin production in P. stewartii::crtW (B. 
aurantiaca). 

DETAILED DESCRIPTION 

Nucleic Acid Molecules 

The invention features isolated nucleic acids that encode enzymes involved in 
carotenoid biosynthesis. The nucleic acids of SEQ ID NO:1, 3, 5, 7, 9, and 11 encode 
zeaxanthin glucosyl transferase (crtX), lycopene β-cyclase (crtY), geranylgeranyl 
pyrophosphate synthase (crtE), phytoene desaturase (crtI), phytoene synthase (crtB), and 
β-carotene hydroxylase (crtZ), respectively. A nucleic acid of the invention can have at 
least 76% sequence identity, e.g., 78%, 80%, 85%, 90%, 95%, or 99% sequence identity, 
to the nucleic acid of SEQ ID NO:1, or to fragments of the nucleic acid of SEQ ID NO:1 
that are at least about 33 nucleotides in length; at least 78% sequence identity, e.g., 80%, 
85%, 90%, 95%, or 99% sequence identity, to the nucleotide sequence of SEQ ID NO:3, 
or to fragments of the nucleic acid of SEQ ID NO:3 that are at least about 32 nucleotides 
in length; at least 81% sequence identity, e.g., 82%, 85%, 90%, 95%, or 99% sequence 
identity, to the nucleotide sequence of SEQ ID NO:5, or to fragments of the nucleic acid 
of SEQ ID NO:5 that are at least about 60 nucleotides in length; at least 82% sequence 
identity, e.g., 83%, 85%, 90%, 95%, or 99% sequence identity, to the nucleotide 
sequences of SEQ ID NO:7 or SEQ ID NO:9, or to fragments of the nucleic acids of 
SEQ ID NO:7 or SEQ ID NO:9 that are at least about 30 or 23 nucleotides in length, 
respectively; or at least 85% sequence identity, e.g., 86%, 90%, 92%, 95%, or 99% sequence 
identity, to the nucleotide sequence of SEQ ID NO:11, or to fragments of the nucleic acid 
of SEQ ID NO:11 that are at least about 36 nucleotides in length. A nucleic acid of the 
invention can have at least 60% sequence identity, e.g., at least 65%, 70%, 75%, 80%, 
85%, 90%, 95%, or 99% sequence identity, to the nucleotide sequence of SEQ ID NO:38 
or to fragments of the nucleic acid of SEQ ID NO:38 that are at least about 15 nucleotides 
in length. Such a nucleic acid can encode a β-carotene C4 oxygenase (crtW). A nucleic 
acid of the invention also can have at least 90% identity to the nucleotide sequence set 
forth in SEQ ID NO:44 or to fragments of the nucleic acid of SEQ ID NO:44 that are at 
least about 60 nucleotides in length. Such a nucleic acid can encode a multifunctional 
geranylgeranyl pyrophosphate synthase. 
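
The identity and fragment-length floors recited above can be collected in a small lookup table. The following Python sketch is illustrative only: the names NUCLEIC_ACID_THRESHOLDS and meets_threshold are hypothetical, and treating the two floors as a joint test on a single aligned region is a simplifying assumption rather than a statement of how the thresholds are applied.

```python
# Minimum percent identity and minimum fragment length (nucleotides),
# keyed by SEQ ID NO and transcribed from the paragraph above.
NUCLEIC_ACID_THRESHOLDS = {
    1: (76.0, 33),
    3: (78.0, 32),
    5: (81.0, 60),
    7: (82.0, 30),
    9: (82.0, 23),
    11: (85.0, 36),
    38: (60.0, 15),
    44: (90.0, 60),
}


def meets_threshold(seq_id_no, percent_identity, aligned_length):
    """Hypothetical helper: True when an aligned region meets both floors."""
    min_identity, min_length = NUCLEIC_ACID_THRESHOLDS[seq_id_no]
    return percent_identity >= min_identity and aligned_length >= min_length
```

For instance, a 40-nucleotide region that is 80% identical to SEQ ID NO:1 passes this check, while a 22-nucleotide region aligned to SEQ ID NO:9 does not, whatever its identity.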

Generally, percent sequence identity is calculated by determining the number of 
matched positions in aligned nucleic acid sequences, dividing the number of matched 
positions by the total number of aligned nucleotides, and multiplying by 100. A matched 
position refers to a position in which identical nucleotides occur at the same position in 
aligned nucleic acid sequences. Percent sequence identity can be determined for any 
nucleic acid or amino acid sequence as follows. First, a nucleic acid or amino acid 
sequence is compared to the identified nucleic acid or amino acid sequence using the 
BLAST 2 Sequences (Bl2seq) program from the stand-alone version of BLASTZ 
containing BLASTN version 2.0.14 and BLASTP version 2.0.14. This stand-alone 
version of BLASTZ can be obtained from the University of Wisconsin library as well as 
at www.fr.com or www.ncbi.nlm.nih.gov. Instructions explaining how to use the Bl2seq 
program can be found in the readme file accompanying BLASTZ. 

Bl2seq performs a comparison between two sequences using either the BLASTN 
or BLASTP algorithm. BLASTN is used to compare nucleic acid sequences, while 
BLASTP is used to compare amino acid sequences. To compare two nucleic acid 
sequences, the options are set as follows: -i is set to a file containing the first nucleic acid 
sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the second 
nucleic acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastn; -o is set to any 
desired file name (e.g., C:\output.txt); -q is set to -1; -r is set to 2; and all other options 
are left at their default setting. For example, the following command can be used to 
generate an output file containing a comparison between two sequences: C:\Bl2seq -i 
c:\seq1.txt -j c:\seq2.txt -p blastn -o c:\output.txt -q -1 -r 2. To compare two amino acid 
sequences, the options of Bl2seq are set as follows: -i is set to a file containing the first 
amino acid sequence to be compared (e.g., C:\seq1.txt); -j is set to a file containing the 
second amino acid sequence to be compared (e.g., C:\seq2.txt); -p is set to blastp; -o is set 
to any desired file name (e.g., C:\output.txt); and all other options are left at their default 
setting. For example, the following command can be used to generate an output file 
containing a comparison between two amino acid sequences: C:\Bl2seq -i c:\seq1.txt -j 
c:\seq2.txt -p blastp -o c:\output.txt. If the target sequence shares homology with any 
portion of the identified sequence, then the designated output file will present those 
regions of homology as aligned sequences. If the target sequence does not share 
homology with any portion of the identified sequence, then the designated output file will 
not present aligned sequences. 
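
The option settings described above can be assembled programmatically. The Python sketch below only builds the argument list and does not run anything; the helper name bl2seq_args is hypothetical, and the executable name and file names are placeholders taken from the example commands in the text.

```python
def bl2seq_args(first_file, second_file, program, output_file):
    """Return the Bl2seq argument list described above (nothing is executed)."""
    if program not in ("blastn", "blastp"):
        raise ValueError("program must be 'blastn' or 'blastp'")
    args = ["Bl2seq", "-i", first_file, "-j", second_file,
            "-p", program, "-o", output_file]
    if program == "blastn":
        # Nucleic acid comparisons additionally set -q to -1 and -r to 2;
        # amino acid comparisons leave all other options at their defaults.
        args += ["-q", "-1", "-r", "2"]
    return args


print(" ".join(bl2seq_args("seq1.txt", "seq2.txt", "blastn", "output.txt")))
```

Keeping the arguments as a list rather than a single string avoids quoting problems if the list is later handed to a process launcher.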

Once aligned, a length is determined by counting the number of consecutive 
nucleotides or amino acid residues from the target sequence presented in alignment with 
sequence from the identified sequence starting with any matched position and ending with 
any other matched position. A matched position is any position where an identical 
nucleotide or amino acid residue is presented in both the target and identified sequence. 
Gaps presented in the target sequence are not counted since gaps are not nucleotides or 
amino acid residues. Likewise, gaps presented in the identified sequence are not counted 
since target sequence nucleotides or amino acid residues are counted, not nucleotides or 
amino acid residues from the identified sequence. 
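
The counting rule above can be sketched in Python as follows. This is a minimal illustration that assumes the alignment is supplied as two equal-length strings with "-" marking gaps; the function name aligned_length and the 0-based column indexing are choices made for this sketch, not part of the Bl2seq output format.

```python
def aligned_length(target_aln, identified_aln, start_col, end_col):
    """Count target residues between two matched columns (0-based, inclusive).

    Gap characters ("-") in the target row are skipped, and columns where only
    the identified row has a gap still count, because only target residues are
    counted.
    """
    if len(target_aln) != len(identified_aln):
        raise ValueError("aligned rows must have equal length")
    for col in (start_col, end_col):
        if target_aln[col] == "-" or target_aln[col] != identified_aln[col]:
            raise ValueError("start and end columns must be matched positions")
    return sum(1 for ch in target_aln[start_col:end_col + 1] if ch != "-")
```

With the made-up rows "ACG-TAC" (target) and "ACGGT-C" (identified), the region from column 0 to column 6 has a length of 6: the gap in the target row is skipped, while the target residue opposite the gap in the identified row is counted.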

The percent identity over a particular length is determined by counting the number 
of matched positions over that length and dividing that number by the length followed by 
multiplying the resulting value by 100. For example, if (1) a 1000 nucleotide target 
sequence is compared to the sequence set forth in SEQ ID NO:1, (2) the Bl2seq program 
presents 200 nucleotides from the target sequence aligned with a region of the sequence 
set forth in SEQ ID NO:1 where the first and last nucleotides of that 200 nucleotide 
region are matches, and (3) the number of matches over those 200 aligned nucleotides is 
180, then the 1000 nucleotide target sequence contains a length of 200 and a percent 
identity over that length of 90 (i.e., 180 ÷ 200 × 100 = 90). 

It will be appreciated that a single nucleic acid or amino acid target sequence that 
aligns with an identified sequence can have many different lengths with each length 
having its own percent identity. For example, a target sequence containing a 20 
nucleotide region that aligns with an identified sequence as follows has many different 
lengths including those listed in Table 1. 

                      1                  20 
Target Sequence:      AGGTCGTGTACTGTCAGTCA (SEQ ID NO:46) 
                      | || ||| |||| |||| | 
Identified Sequence:  ACGTGGTGAACTGCCAGTGA (SEQ ID NO:47) 

TABLE 1 

Starting    Ending      Length    Matched      Percent 
Position    Position              Positions    Identity 

1           20          20        15           75.0 
1           18          18        14           77.8 
1           15          15        11           73.3 
6           20          15        12           80.0 
6           17          12        10           83.3 
6           15          10        8            80.0 
8           20          13        10           76.9 
8           16          9         7            77.8 



It is noted that the percent identity value is rounded to the nearest tenth. For example, 
78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1, while 78.15, 78.16, 78.17, 78.18, 
and 78.19 are rounded up to 78.2. It is also noted that the length value will always be an 
integer. 
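
The entries of Table 1 can be recomputed directly from the two 20-nucleotide sequences shown above. The Python sketch below assumes an ungapped alignment and 1-based inclusive positions; the function name percent_identity is hypothetical. Decimal arithmetic with ROUND_HALF_UP is used so that a value such as 78.15 rounds up to 78.2 as the text describes, which binary floating-point rounding does not guarantee.

```python
from decimal import Decimal, ROUND_HALF_UP

TARGET = "AGGTCGTGTACTGTCAGTCA"      # SEQ ID NO:46
IDENTIFIED = "ACGTGGTGAACTGCCAGTGA"  # SEQ ID NO:47


def percent_identity(target, identified, start, end):
    """Return (length, matches, percent) for 1-based inclusive positions."""
    pairs = list(zip(target, identified))[start - 1:end]
    if pairs[0][0] != pairs[0][1] or pairs[-1][0] != pairs[-1][1]:
        raise ValueError("a length must start and end on matched positions")
    length = len(pairs)
    matches = sum(1 for t, i in pairs if t == i)
    # Round to the nearest tenth, with halves rounded up.
    percent = (Decimal(matches) * 100 / Decimal(length)).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP)
    return length, matches, percent


# The eight (starting position, ending position) pairs listed in Table 1.
for start, end in [(1, 20), (1, 18), (1, 15), (6, 20),
                   (6, 17), (6, 15), (8, 20), (8, 16)]:
    length, matches, percent = percent_identity(TARGET, IDENTIFIED, start, end)
    print(start, end, length, matches, percent)
```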

Isolated nucleic acid molecules of the invention are at least about 20 nucleotides 
in length. For example, the nucleic acid molecule can be about 20-30, 22-32, 33-50, 34 to 
45, 40-50, 60-80, 62 to 92, 50-100, or greater than 150 nucleotides in length, e.g., 200- 
300, 300-500, or 500-1000 nucleotides in length. Such fragments, whether protein- 
encoding or not, can be used as probes, primers, and diagnostic reagents. In some 
embodiments, the isolated nucleic acid molecules encode a full-length zeaxanthin 
glucosyl transferase, lycopene β-cyclase, geranylgeranyl pyrophosphate synthase, 
phytoene desaturase, β-carotene hydroxylase, β-carotene C4 oxygenase, or 
multifunctional geranylgeranyl pyrophosphate synthase polypeptide. Nucleic acid 
molecules can be DNA or RNA, linear or circular, and in sense or antisense orientation. 

Isolated nucleic acid molecules of the invention can be produced by standard 
techniques. As used herein, "isolated" refers to a sequence corresponding to part or all of 
a gene encoding a zeaxanthin glucosyl transferase, lycopene β-cyclase, geranylgeranyl 
pyrophosphate synthase, phytoene desaturase, phytoene synthase, β-carotene 
hydroxylase, β-carotene C4 oxygenase, or multifunctional geranylgeranyl pyrophosphate 
synthase polypeptide, or an operon encoding two or more such polypeptides, but free of 
sequences that normally flank one or both sides of the wild-type gene or the operon in a 
naturally-occurring genome, e.g., a bacterial genome. The term "isolated" as used herein 
with respect to nucleic acids also includes any non-naturally-occurring nucleic acid 
sequence since such non-naturally-occurring sequences are not found in nature and do not 
have immediately contiguous sequences in a naturally-occurring genome. 

An isolated nucleic acid can be, for example, a DNA molecule, provided one of 
the nucleic acid sequences normally found immediately flanking that DNA molecule in a 
naturally-occurring genome is removed or absent. Thus, an isolated nucleic acid 
includes, without limitation, a DNA molecule that exists as a separate molecule (e.g., a 
cDNA or genomic DNA fragment produced by PCR or restriction endonuclease 
treatment) independent of other sequences as well as recombinant DNA that is 
incorporated into a vector, an autonomously replicating plasmid, a virus (e.g., a 
retrovirus, adenovirus, or herpes virus), or into the genomic DNA of a prokaryote or 
eukaryote. In addition, an isolated nucleic acid can include an engineered nucleic acid 
such as a recombinant DNA molecule that is part of a hybrid or fusion nucleic acid. A 
nucleic acid existing among hundreds to millions of other nucleic acids within, for 
example, cDNA libraries or genomic libraries, or gel slices containing a genomic DNA 
restriction digest, is not to be considered an isolated nucleic acid. 

Isolated nucleic acids within the scope of the invention can be obtained using any 
method including, without limitation, common molecular cloning and chemical nucleic 
acid synthesis techniques. For example, polymerase chain reaction (PCR) techniques can 
be used to obtain an isolated nucleic acid containing a nucleic acid sequence sharing 
identity with the sequences set forth in SEQ ID NO:1, 3, 5, 7, 9, 11, 38, or 44. PCR 
refers to a procedure or technique in which target nucleic acids are amplified. Sequence 
information from the ends of the region of interest or beyond typically is employed to 
design oligonucleotide primers that are identical in sequence to opposite strands of the 
template to be amplified. PCR can be used to amplify specific sequences from DNA as 
well as RNA, including sequences from total genomic DNA or total cellular RNA. 
Primers are typically 14 to 40 nucleotides in length, but can range from 10 nucleotides to 
hundreds of nucleotides in length. General PCR techniques are described, for example, in 
PCR Primer: A Laboratory Manual, Ed. by Dieffenbach, C. and Dveksler, G., Cold 
Spring Harbor Laboratory Press, 1995. When using RNA as a source of template, reverse 
transcriptase can be used to synthesize complementary DNA (cDNA) strands. 

Isolated nucleic acids of the invention also can be chemically synthesized, either 
as a single nucleic acid molecule or as a series of oligonucleotides. For example, one or 
more pairs of long oligonucleotides (e.g., >100 nucleotides) can be synthesized that 
contain the desired sequence, with each pair containing a short segment of 
complementary DNA (e.g., about 15 nucleotides) such that a duplex is formed when the 
oligonucleotide pair is annealed. DNA polymerase is used to extend the oligonucleotides, 
resulting in a double-stranded nucleic acid molecule per oligonucleotide pair, which then 
can be ligated into a vector. 
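
The annealing and extension of one oligonucleotide pair can be modeled on strings. The Python sketch below is a simplified illustration, assuming both oligonucleotides are written 5' to 3', contain only A, C, G, and T, and are complementary over exactly the stated number of nucleotides at their 3' ends; the names reverse_complement and extend_pair are hypothetical, and the sketch does not model polymerase fidelity or annealing conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_pair(forward, reverse, overlap=15):
    """Return the top strand of the duplex formed after annealing and extension.

    forward is the top-strand oligonucleotide and reverse is the bottom-strand
    oligonucleotide, both written 5' to 3'.
    """
    rc_reverse = reverse_complement(reverse)  # bottom oligo read as top strand
    if forward[-overlap:] != rc_reverse[:overlap]:
        raise ValueError("3' ends are not complementary over the overlap")
    # Extension fills in the single-stranded regions on either side of the
    # annealed segment, so the top strand spans both oligonucleotides.
    return forward + rc_reverse[overlap:]
```

For a made-up 40-nucleotide design, a forward oligonucleotide covering positions 1-25 and a reverse oligonucleotide covering the complement of positions 11-40 share a 15-nucleotide annealed segment, and extend_pair returns the full 40-nucleotide top strand.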

Isolated nucleic acids of the invention also can be obtained by mutagenesis. For 
example, an isolated nucleic acid that shares identity with a sequence set forth in SEQ ID 
NO:1, 3, 5, 7, 9, 11, 38, or 44 can be mutated using common molecular cloning 
techniques (e.g., site-directed mutagenesis). Possible mutations include, without 
limitation, deletions, insertions, and substitutions, as well as combinations of deletions, 
insertions, and substitutions. Alignments of nucleic acids of the invention with other 
known sequences encoding carotenoid enzymes can be used to identify positions to 
modify. For example, alignment of the nucleotide sequence of SEQ ID NO:5 with other 
nucleic acids encoding geranylgeranyl pyrophosphate synthases (e.g., from Erwinia 
uredovora) provides guidance as to which nucleotides can be substituted, which 
nucleotides can be deleted, and at which positions nucleotides can be inserted. 

In addition, nucleic acid and amino acid databases (e.g., GenBank®) can be used 
to obtain an isolated nucleic acid within the scope of the invention. For example, any 
nucleic acid sequence having homology to a sequence set forth in SEQ ID NO:1, 3, 5, 7, 
9, 11, 38, or 44, or any amino acid sequence having homology to a sequence set forth in 
SEQ ID NO:2, 4, 6, 8, 10, 12, 39, or 45 can be used as a query to search GenBank®. 

Furthermore, nucleic acid hybridization techniques can be used to obtain an 
isolated nucleic acid within the scope of the invention. Briefly, any nucleic acid having 
some homology to a sequence set forth in SEQ ID NO:1, 3, 5, 7, 9, 11, 38, or 44 can be 
used as a probe to identify a similar nucleic acid by hybridization under conditions of 
moderate to high stringency. Moderately stringent hybridization conditions include 
hybridization at about 42°C in a hybridization solution containing 25 mM KPO₄ (pH 7.4), 
5X SSC, 5X Denhardt's solution, 50 µg/mL denatured, sonicated salmon sperm DNA, 
50% formamide, 10% Dextran sulfate, and 1-15 ng/mL probe (about 5×10⁷ cpm/µg), and 
wash steps at about 50°C with a wash solution containing 2X SSC and 0.1% SDS. For 
high stringency, the same hybridization conditions can be used, but washes are performed 
at about 65°C with a wash solution containing 0.2X SSC and 0.1% SDS. 

Once a nucleic acid is identified, the nucleic acid then can be purified, sequenced, 
and analyzed to determine whether it is within the scope of the invention as described 
herein. Hybridization can be done by Southern or Northern analysis to identify a DNA or 
RNA sequence, respectively, that hybridizes to a probe. The probe can be labeled with 
biotin, digoxigenin, an enzyme, or a radioisotope such as ³²P or ³⁵S. The DNA or RNA 
to be analyzed can be electrophoretically separated on an agarose or polyacrylamide gel, 
transferred to nitrocellulose, nylon, or other suitable membrane, and hybridized with the 
probe using standard techniques well known in the art. See, for example, sections 7.39- 
7.52 of Sambrook et al., (1989) Molecular Cloning, second edition, Cold Spring Harbor 
Laboratory, Plainview, NY. 

Polypeptides 

The present invention also features isolated zeaxanthin glucosyl transferase (SEQ 
ID NO:2), lycopene β-cyclase (SEQ ID NO:4), geranylgeranyl pyrophosphate synthase 
(SEQ ID NO:6), phytoene desaturase (SEQ ID NO:8), phytoene synthase (SEQ ID 
NO:10), and β-carotene hydroxylase (SEQ ID NO:12) polypeptides. In addition, the 
invention features isolated β-carotene C4 oxygenase polypeptides (SEQ ID NO:39) and 
multifunctional geranylgeranyl pyrophosphate synthase polypeptides (SEQ ID NO:45). 
A polypeptide of the invention can have at least 75% sequence identity, e.g., 80%, 85%, 
90%, 95%, or 99% sequence identity, to the amino acid sequence of SEQ ID NO:2 or to 
fragments thereof; at least 83% sequence identity, e.g., 85%, 90%, 95%, or 99% sequence 
identity, to the amino acid sequence of SEQ ID NO:4 or to fragments thereof; at least 
85% sequence identity, e.g., 90%, 95%, or 99% sequence identity, to the amino acid 
sequence of SEQ ID NO:6 or to fragments thereof; at least 90% sequence identity, e.g., 
90%, 92%, 95%, or 99% sequence identity, to the amino acid sequence of SEQ ID NO:8 
or to fragments thereof; at least 89% sequence identity, e.g., 90%, 95%, or 99% sequence 
identity, to the amino acid sequence of SEQ ID NO:10 or to fragments thereof; at least 
90% sequence identity, e.g., 95% or 99% sequence identity, to the amino acid sequence 
of SEQ ID NO:12 or to fragments thereof; at least 60% sequence identity, e.g., 65%, 
70%, 75%, 80%, 85%, 90%, 95%, or 99% sequence identity, to the amino acid sequence 
of SEQ ID NO:39 or to fragments thereof; or at least 90% sequence identity, e.g., 95% or 
99% sequence identity, to the amino acid sequence set forth in SEQ ID NO:45 or to 
fragments thereof. Percent sequence identity can be determined as described above for 
nucleic acid molecules. 

An "isolated polypeptide" has been separated from cellular components that 
naturally accompany it. Typically, the polypeptide is isolated when it is at least 60% 
(e.g., 70%, 80%, 90%, 95%, or 99%), by weight, free from proteins and naturally- 
occurring organic molecules that are naturally associated with it. In general, an isolated 
polypeptide will yield a single major band on a non-reducing polyacrylamide gel. 

The term "polypeptide" includes any chain of amino acids, regardless of length or 
post-translational modification. Polypeptides that have identity to the amino acid 
sequences of SEQ ID NO:2, 4, 6, 8, 10, 12, 39, or 45 can retain the function of the 
enzyme (see FIG. 1 for a schematic of the carotenoid biosynthesis pathway). For 
example, geranylgeranyl pyrophosphate synthase can produce geranylgeranyl 
pyrophosphate (GGPP) by condensing together isopentenyl pyrophosphate (IPP) with 
farnesyl pyrophosphate (FPP). Phytoene synthase can produce phytoene by condensing 
together two molecules of GGPP. Phytoene desaturase can perform four successive 
desaturations on phytoene to form lycopene. Lycopene β-cyclase can perform two 
successive cyclization reactions on lycopene to form β-carotene. β-carotene hydroxylase 
can perform two successive hydroxylation reactions on β-carotene to form zeaxanthin. 
Alternatively, β-carotene hydroxylase can perform two successive hydroxylation 
reactions on canthaxanthin to form astaxanthin. Zeaxanthin glucosyl transferase can add 
one or two glucose or other sugar moieties to zeaxanthin to form zeaxanthin 
monoglycoside or diglycoside, respectively. β-carotene C4 oxygenase can convert the 
methylene groups at the C4 and C4' positions of β-carotene or zeaxanthin to keto groups 
to form canthaxanthin or astaxanthin, respectively. Multifunctional geranylgeranyl 
pyrophosphate synthase can directly convert 3 IPP molecules and 1 dimethylallyl 
pyrophosphate (DMAPP) molecule to 1 GGPP molecule. 
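
The enzyme steps described above form a small directed graph, and the two orders in which astaxanthin can be reached from β-carotene can be enumerated from it. The Python sketch below is illustrative only: each step is reduced to a single substrate and product (the IPP co-substrate and the mono-glycoside intermediate are omitted), and the names STEPS and routes are hypothetical.

```python
# (enzyme, substrate, product) steps as stated in the paragraph above.
STEPS = [
    ("geranylgeranyl pyrophosphate synthase", "FPP", "GGPP"),
    ("multifunctional geranylgeranyl pyrophosphate synthase", "DMAPP", "GGPP"),
    ("phytoene synthase", "GGPP", "phytoene"),
    ("phytoene desaturase", "phytoene", "lycopene"),
    ("lycopene beta-cyclase", "lycopene", "beta-carotene"),
    ("beta-carotene hydroxylase", "beta-carotene", "zeaxanthin"),
    ("beta-carotene hydroxylase", "canthaxanthin", "astaxanthin"),
    ("beta-carotene C4 oxygenase", "beta-carotene", "canthaxanthin"),
    ("beta-carotene C4 oxygenase", "zeaxanthin", "astaxanthin"),
    ("zeaxanthin glucosyl transferase", "zeaxanthin", "zeaxanthin diglycoside"),
]


def routes(start, goal):
    """Return every enzyme route from start to goal (depth-first search)."""
    found = []

    def walk(compound, enzymes, seen):
        if compound == goal:
            found.append(enzymes)
            return
        for enzyme, substrate, product in STEPS:
            if substrate == compound and product not in seen:
                walk(product, enzymes + [enzyme], seen | {product})

    walk(start, [], {start})
    return found


for route in routes("phytoene", "astaxanthin"):
    print(" -> ".join(route))
```

Both routes from phytoene to astaxanthin use the same four enzymes and differ only in whether hydroxylation or C4 oxygenation of β-carotene occurs first.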

In general, conservative amino acid substitutions, i.e., substitutions of similar 
amino acids, are tolerated without affecting protein function. Similar amino acids are 
those that are similar in size and/or charge properties. Families of amino acids with 
similar side chains are known. These families include amino acids with basic side chains 
(e.g., lysine, arginine, or histidine), acidic side chains (e.g., aspartic acid or glutamic 
acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, 
tyrosine, or cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, 
proline, phenylalanine, methionine, or tryptophan), β-branched side chains (e.g., 
threonine, valine, or isoleucine), and aromatic side chains (e.g., tyrosine, phenylalanine, 
tryptophan, or histidine). 
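
The side-chain families listed above can be used to flag whether a proposed substitution stays within a family. The Python sketch below is a minimal illustration: the names FAMILIES, shared_families, and is_conservative are hypothetical, and treating membership in any shared family as "conservative" is a simplifying assumption that says nothing about whether a particular substitution preserves activity.

```python
# Families of amino acids with similar side chains, as listed above.
FAMILIES = {
    "basic": {"lysine", "arginine", "histidine"},
    "acidic": {"aspartic acid", "glutamic acid"},
    "uncharged polar": {"glycine", "asparagine", "glutamine", "serine",
                        "threonine", "tyrosine", "cysteine"},
    "nonpolar": {"alanine", "valine", "leucine", "isoleucine", "proline",
                 "phenylalanine", "methionine", "tryptophan"},
    "beta-branched": {"threonine", "valine", "isoleucine"},
    "aromatic": {"tyrosine", "phenylalanine", "tryptophan", "histidine"},
}


def shared_families(first, second):
    """Return the sorted names of families containing both amino acids."""
    return sorted(name for name, members in FAMILIES.items()
                  if first in members and second in members)


def is_conservative(first, second):
    """True when the two amino acids share at least one listed family."""
    return bool(shared_families(first, second))
```

Under this sketch, lysine to arginine is flagged as conservative (both basic), whereas lysine to aspartic acid is not.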

Mutagenesis also can be used to alter a nucleic acid such that activity of the 
polypeptide encoded by the nucleic acid is altered (e.g., to increase production of a 
particular carotenoid). For example, error-prone PCR (e.g., GeneMorph PCR 
Mutagenesis Kit; Stratagene Inc., La Jolla, CA; Catalog #600550; Revision #090001) can 
be used to mutagenize the B. aurantiaca crtW gene (SEQ ID NO:38) to increase the 
relative amount of di-keto carotenoid (e.g., astaxanthin (3,3'-dihydroxy-β,β-carotene-4,4'- 
dione) or canthaxanthin (β,β-carotene-4,4'-dione)) relative to mono-keto carotenoid (e.g., 
echinenone (β,β-carotene-4-one) or adonixanthin (3,3'-dihydroxy-β,β-carotene-4-one)) that 
is produced. In general, the nucleic acid to be mutagenized can be cloned into a vector 
such as pCR-Blunt II-TOPO (Clontech; Palo Alto, CA) and used as a template for error- 
prone PCR. For purposes of directed evolution, mutation frequencies of 2-7 nucleotides / 
Kbp template (1-4 amino acid mutations / 333 amino acids) generally are desired. 
Mutation frequency can be lowered or raised by increasing or decreasing the template 
concentration, respectively. PCR can be performed according to manufacturer's 
recommendations. Mutagenized nucleic acid is ligated into an expression vector, which is 
used to transform a host, and activity of the expressed protein is assessed. For example, 
in the case of the crtW gene, electrocompetent P. stewartii (ATCC 8200) cells can be 
prepared and transformed as described herein, and resulting individual colonies can be 
screened by visual inspection for a phenotypic change from bright yellow pigmentation 
(production of zeaxanthin) to yellow-orange (production of mono-keto carotenoid) or 
reddish-orange (production of di-keto carotenoid). Production of increased amounts of 
astaxanthin can be confirmed by HPLC/MS. 
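
The desired mutation frequency scales linearly with template length, which can be checked with simple arithmetic. In the Python sketch below, the function name expected_mutations is hypothetical and the gene lengths passed to it are made-up examples; the length of the crtW gene is not assumed here.

```python
def expected_mutations(gene_length_nt, low_per_kb=2, high_per_kb=7):
    """Scale the 2-7 nucleotides/Kbp target range to a given template length."""
    scale = gene_length_nt / 1000
    return low_per_kb * scale, high_per_kb * scale


# One Kbp of coding sequence corresponds to 333 complete codons, which is
# why the text pairs "per Kbp" with "per 333 amino acids".
CODONS_PER_KB = 1000 // 3

print(CODONS_PER_KB)
print(expected_mutations(1000))  # the range for a hypothetical 1 Kbp template
print(expected_mutations(500))   # the range for a hypothetical 500 bp template
```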

Isolated polypeptides of the invention can be obtained, for example, by extraction 
from a natural source (e.g., a plant or bacterial cell), chemical synthesis, or by
recombinant production in a host. For example, a polypeptide of the invention can be 
produced by ligating a nucleic acid molecule encoding the polypeptide into a nucleic acid 
construct such as an expression vector, and transforming a bacterial or eukaryotic host 
cell with the expression vector. In general, nucleic acid constructs include expression 
control elements operably linked to a nucleic acid sequence encoding a polypeptide of the 
invention (e.g., zeaxanthin glucosyl transferase, lycopene β-cyclase, geranylgeranyl
pyrophosphate synthase, phytoene desaturase, phytoene synthase, β-carotene
hydroxylase, β-carotene C4 oxygenase, or multifunctional geranylgeranyl pyrophosphate
synthase polypeptides). Expression control elements do not typically encode a gene 
product, but instead affect the expression of the nucleic acid sequence. As used herein, 
"operably linked" refers to connection of the expression control elements to the nucleic 
acid sequence in such a way as to permit expression of the nucleic acid sequence. 
Expression control elements can include, for example, promoter sequences, enhancer 
sequences, response elements, polyadenylation sites, or inducible elements. Non-limiting 
examples of promoters include the puf promoter from Rhodobacter sphaeroides 
(GenBank Accession No. E13945), the nifHDK promoter from R. sphaeroides (GenBank
Accession No. AF031817), and the flxK promoter from R. sphaeroides (GenBank
Accession No. U86454).

In bacterial systems, a strain of E. coli such as DH10B or BL-21 can be used.
Suitable E. coli vectors include, but are not limited to, pUC18, pUC19, the pGEX series
of vectors that produce fusion proteins with glutathione S-transferase (GST), and 
pBluescript series of vectors. Transformed E. coli are typically grown exponentially then 
stimulated with isopropylthiogalactopyranoside (IPTG) prior to harvesting. In general, 
fusion proteins produced from the pGEX series of vectors are soluble and can be purified 
easily from lysed cells by adsorption to glutathione-agarose beads followed by elution in 
the presence of free glutathione. The pGEX vectors are designed to include thrombin or 
factor Xa protease cleavage sites such that the cloned target gene product can be released 
from the GST moiety. 

In eukaryotic host cells, a number of viral-based expression systems can be 
utilized to express polypeptides of the invention. A nucleic acid encoding a polypeptide 
of the invention can be cloned into, for example, a baculoviral vector such as pBlueBac 
(Invitrogen, San Diego, CA) and then used to co-transfect insect cells such as Spodoptera 
frugiperda (Sf9) cells with wild-type DNA from Autographa californica multiply 
enveloped nuclear polyhedrosis virus (AcMNPV). Recombinant viruses producing 
polypeptides of the invention can be identified by standard methodology. Alternatively, a 
nucleic acid encoding a polypeptide of the invention can be introduced into a SV40, 
retroviral, or vaccinia based viral vector and used to infect suitable host cells. 

A polypeptide within the scope of the invention can be "engineered" to contain an 
amino acid sequence that allows the polypeptide to be captured onto an affinity matrix. 
For example, a tag such as c-myc, hemagglutinin, polyhistidine, or Flag™ tag (Kodak) 
can be used to aid polypeptide purification. Such tags can be inserted anywhere within 
the polypeptide including at either the carboxyl or amino termini. Other fusions that 
could be useful include enzymes that aid in the detection of the polypeptide, such as 
alkaline phosphatase. 

Agrobacterium-mediated transformation, electroporation and particle gun 
transformation can be used to transform plant cells. Illustrative examples of 
transformation techniques are described in U.S. Patent No. 5,204,253 (particle gun) and 
U.S. Patent No. 5,188,958 (Agrobacterium). Transformation methods utilizing the Ti and 
Ri plasmids of Agrobacterium spp. typically use binary type vectors. See, Walkerpeach, C.
et al., in Plant Molecular Biology Manual, S. Gelvin and R. Schilperoort, eds., Kluwer
Dordrecht, C1:1-19 (1994). If cell or tissue cultures are used as the recipient tissue for
transformation, plants can be regenerated from transformed cultures by techniques known 
to those skilled in the art. 

Engineered Cells

Any cell containing an isolated nucleic acid within the scope of the invention is 
itself within the scope of the invention. This includes, without limitation, prokaryotic 
cells such as R. sphaeroides cells and eukaryotic cells such as plant, yeast, and other 
fungal cells. It is noted that cells containing an isolated nucleic acid of the invention are
not required to express the isolated nucleic acid. In addition, the isolated nucleic acid can 
be integrated into the genome of the cell or maintained in an episomal state. In other 
words, cells can be stably or transiently transfected with an isolated nucleic acid of the 
invention. 

Any method can be used to introduce an isolated nucleic acid into a cell. In fact,
many methods for introducing nucleic acid into a cell, whether in vivo or in vitro, are well
known to those skilled in the art. For example, calcium phosphate precipitation,
conjugation, electroporation, heat shock, lipofection, microinjection, and viral-mediated
nucleic acid transfer are common methods that can be used to introduce nucleic acid
molecules into a cell. In addition, naked DNA can be delivered directly to cells in vivo as
described elsewhere (U.S. Patent Nos. 5,580,859 and 5,589,466). Furthermore, nucleic
acid can be introduced into cells by generating transgenic animals.

Any method can be used to identify cells that contain an isolated nucleic acid
within the scope of the invention. For example, PCR and nucleic acid hybridization
techniques such as Northern and Southern analysis can be used. In some cases,
immunohistochemistry and biochemical techniques can be used to determine if a cell
contains a particular nucleic acid by detecting the expression of a polypeptide encoded by
that particular nucleic acid. For example, the polypeptide of interest can be detected with
an antibody having specific binding affinity for that polypeptide, which indicates that the
cell not only contains the introduced nucleic acid but also expresses the encoded
polypeptide. Enzymatic activities of the polypeptide of interest also can be detected or an
end product (e.g., a particular carotenoid) can be detected as an indication that the cell
contains the introduced nucleic acid and expresses the encoded polypeptide from that
introduced nucleic acid.

The cells described herein can contain a single copy, or multiple copies (e.g.,
about 5, 10, 20, 35, 50, 75, 100 or 150 copies), of a particular exogenous nucleic acid.
All non-naturally-occurring nucleic acids are considered an exogenous nucleic acid once 
introduced into the cell. The term "exogenous" as used herein with reference to a nucleic 
acid and a particular cell refers to any nucleic acid that does not originate from that 
particular cell as found in nature. Nucleic acid that is naturally-occurring also can be 
exogenous to a particular cell. For example, an entire operon that is isolated from a
bacterium is an exogenous nucleic acid with respect to a second bacterium once that operon
is introduced into the second bacterium. For example, a bacterial cell (e.g., Rhodobacter) can
contain about 50 copies of an exogenous nucleic acid of the invention. In addition, the 
cells described herein can contain more than one particular exogenous nucleic acid. For 
example, a bacterial cell can contain about 50 copies of exogenous nucleic acid X as well 
as about 75 copies of exogenous nucleic acid Y. In these cases, each different nucleic 
acid can encode a different polypeptide having its own unique enzymatic activity. For 
example, a bacterial cell can contain two different exogenous nucleic acids such that a 
high level of astaxanthin or other carotenoid is produced. In addition, a single exogenous 
nucleic acid can encode one or more polypeptides. For example, a single nucleic acid can 
contain sequences that encode three or more different polypeptides. 

Microorganisms that are suitable for producing carotenoids may or may not
naturally produce carotenoids, and include prokaryotic and eukaryotic microorganisms,
such as bacteria, yeast, and fungi. In particular, yeast such as Phaffia rhodozyma
(Xanthophyllomyces dendrorhous), Candida utilis, and Saccharomyces cerevisiae, fungi
such as Neurospora crassa, Phycomyces blakesleeanus, Blakeslea trispora, and
Aspergillus sp., Archaebacteria such as Halobacterium salinarium, and Eubacteria
including Pantoea species (formerly called Erwinia) such as Pantoea stewartii (e.g.,
ATCC Accession #8200), flavobacteria species such as Xanthobacter autotrophicus and
Flavobacterium multivorum, Zymomonas mobilis, Rhodobacter species such as R.
sphaeroides and R. capsulatus, E. coli, and E. vulneris can be used. Other examples of
bacteria that may be used include bacteria in the genus Sphingomonas and Gram negative
bacteria in the α-subdivision, including, for example, Paracoccus, Azotobacter,
Agrobacterium, and Erythrobacter. Eubacteria, and especially R. sphaeroides and R.
capsulatus, are particularly useful. R. sphaeroides and R. capsulatus naturally produce
certain carotenoids and grow on defined media. Such Rhodobacter species also are
non-pyrogenic, minimizing health concerns about use in nutritional supplements. In some
embodiments, it can be useful to produce carotenoids in plants and algae such as Zea
mays, Brassica napus, Lycopersicon esculentum, Tagetes erecta, Haematococcus
pluvialis, Dunaliella salina, Chlorella protothecoides, and Neospongiococcum
excentrum.

It is noted that bacteria can be membranous or non-membranous bacteria. The
term "membranous bacteria" as used herein refers to any naturally-occurring, genetically
modified, or environmentally modified bacteria having an intracytoplasmic membrane.
An intracytoplasmic membrane can be organized in a variety of ways including, without
limitation, vesicles, tubules, thylakoid-like membrane sacs, and highly organized
membrane stacks. Any method can be used to analyze bacteria for the presence of
intracytoplasmic membranes including, without limitation, electron microscopy, light
microscopy, and density gradients. See, e.g., Chory et al., (1984) J. Bacteriol., 159:540-554;
Niederman and Gibson, Isolation and Physiochemical Properties of Membranes
from Purple Photosynthetic Bacteria. In: The Photosynthetic Bacteria, Ed. By Roderick
K. Clayton and William R. Sistrom, Plenum Press, pp. 79-118 (1978); and Lueking et al.,
(1978) J. Biol. Chem., 253:451-457. Examples of membranous bacteria that can be used
include, without limitation, Purple Non-Sulfur Bacteria, including bacteria of the
Rhodospirillaceae family such as those in the genus Rhodobacter (e.g., R. sphaeroides
and R. capsulatus), the genus Rhodospirillum, the genus Rhodopseudomonas, the genus
Rhodomicrobium, and the genus Rhodopila. The term "non-membranous bacteria"
refers to any bacteria lacking intracytoplasmic membrane. Membranous bacteria can be
highly membranous bacteria. The term "highly membranous bacteria" as used herein
refers to any bacterium having more intracytoplasmic membrane than R. sphaeroides
(ATCC 17023) cells have after the R. sphaeroides (ATCC 17023) cells have been
(1) cultured chemoheterotrophically under aerobic conditions for four days, (2) cultured
chemoheterotrophically under anaerobic conditions for four hours, and (3) harvested.
Aerobic culture conditions include culturing the cells in the dark at 30°C in the presence
of 25% oxygen. Anaerobic culture conditions include culturing the cells in the light at
30°C in the presence of 2% oxygen. After the four hour anaerobic culturing step, the R.
sphaeroides (ATCC 17023) cells are harvested by centrifugation and analyzed.

Nucleic acids of the invention can be expressed in microorganisms so that
detectable amounts of carotenoids are produced. As used herein, "detectable" refers to
the ability to detect the carotenoid and any esters or glycosides thereof using standard
analytical methodology. In general, carotenoids can be extracted with an organic solvent
such as acetone or methanol and detected by an absorption scan from 400-500 nm in the
same organic solvent. In some cases, it is desirable to back-extract with a second organic
solvent, such as hexane. The maximal absorbance of each carotenoid depends on the
solvent that it is in. For example, in acetone, the maximal absorbance of lutein is at 451
nm, while maximal absorbance of zeaxanthin is at 454 nm. In hexane, the maximal
absorbance of lutein and zeaxanthin is 446 nm and 450 nm, respectively. High
performance liquid chromatography coupled to mass spectrometry also can be used to
detect carotenoids. Two reverse phase columns that are connected in series can be used
with a solvent gradient of water and acetone. The first column can be a C30 specialty
column designed for carotenoid separation (e.g., YMC™ Carotenoid S3µ; 2.0 x 150 mm,
3 µm particle size; Waters Corporation, PN CT99S031502WT) followed by a C8 Xterra®
MS column (e.g., Xterra® MS C8; 2.1 x 250 mm, 5 µm particle size; Waters Corporation,
PN 186000459).
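
The solvent-dependent absorbance maxima quoted above lend themselves to a small lookup.
The sketch below is illustrative only: the table holds just the four values given in this
paragraph, and the function name is an assumption. Because the listed maxima differ by
only a few nanometers, a match of this kind is a hint at most, and HPLC/MS remains the
confirmatory method.

```python
# Absorbance maxima (nm) quoted in the text, keyed by (solvent, carotenoid).
LAMBDA_MAX_NM = {
    ("acetone", "lutein"): 451,
    ("acetone", "zeaxanthin"): 454,
    ("hexane", "lutein"): 446,
    ("hexane", "zeaxanthin"): 450,
}


def closest_carotenoid(solvent, observed_nm):
    """Return (name, reference_nm) of the listed carotenoid nearest to observed_nm."""
    candidates = [
        (abs(ref - observed_nm), name, ref)
        for (solv, name), ref in LAMBDA_MAX_NM.items()
        if solv == solvent
    ]
    if not candidates:
        raise ValueError(f"no reference maxima for solvent {solvent!r}")
    _, name, ref = min(candidates)
    return name, ref


# Nearest listed maximum for a peak observed at 449 nm in hexane.
print(closest_carotenoid("hexane", 449))
```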

Detectable amounts of carotenoids include 10 µg/g dry cell weight (dcw) and
greater. For example, detectable amounts can be about 10 to 100,000 µg/g dcw, about 100
to 60,000 µg/g dcw, about 500 to 30,000 µg/g dcw, about 1,000 to 20,000 µg/g dcw, about
5,000 to 55,000 µg/g dcw, or about 30,000 µg/g dcw to about 55,000 µg/g dcw. With respect
to algae or other plants or organisms that produce a particular carotenoid, such as
astaxanthin, β-carotene, lycopene, or zeaxanthin, "detectable amount" of carotenoid is an
amount that is detectable over the endogenous level in the plant or organism.
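
For orientation, yields stated in micrograms per gram of dry cell weight can be restated
as a percentage of dry cell weight. The following minimal sketch assumes only the 10 µg/g
lower bound given above; the function names are illustrative.

```python
DETECTION_THRESHOLD_UG_PER_G = 10  # lower bound for "detectable" given in the text


def ug_per_g_to_percent(ug_per_g):
    """Convert micrograms per gram dry cell weight to percent of dry cell weight."""
    # 1 g = 1,000,000 micrograms, so 10,000 micrograms per gram equals 1 percent.
    return ug_per_g / 10_000


def is_detectable(ug_per_g):
    """True when the yield meets the stated lower bound for a detectable amount."""
    return ug_per_g >= DETECTION_THRESHOLD_UG_PER_G


for value in (10, 1_000, 55_000):
    print(value, is_detectable(value), ug_per_g_to_percent(value))
```

On this scale, the upper figure of 55,000 µg/g corresponds to 5.5% of dry cell weight.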

Depending on the microorganism and the metabolites present within the
microorganism, one or more of the following enzymes may be expressed in the
microorganism: geranylgeranyl pyrophosphate synthase, phytoene synthase, phytoene
desaturase, lycopene β cyclase, lycopene ε cyclase, zeaxanthin glycosyl transferase,
β-carotene hydroxylase, β-carotene C-4 ketolase, and multifunctional geranylgeranyl
pyrophosphate synthase. Suitable nucleic acids encoding these enzymes are described
above. Also, see, for example, GenBank Accession No. Y15112 for the sequence of
carotenoid biosynthesis genes of Paracoccus marcusii; GenBank Accession No. D58420
for the carotenoid biosynthesis genes of Agrobacterium aurantiacum; GenBank Accession
Nos. M87280 and M99707 for the sequence of carotenoid biosynthesis genes of Erwinia
herbicola; and GenBank Accession No. U62808 for carotenoid biosynthesis genes of
Flavobacterium sp. Strain R1534.

For example, to produce lycopene in a microorganism that naturally produces
neurosporene, such as Rhodobacter, an exogenous nucleic acid encoding phytoene
desaturase can be expressed, e.g., a phytoene desaturase of the invention, and lycopene
can be detected using standard methodology. Expression of additional carotenoid genes
in such an engineered cell will allow for production of additional carotenoids. For
example, expression of a lycopene β-cyclase in such an engineered cell allows production
of detectable amounts of β-carotene, while further expression of a β-carotene hydroxylase
allows production of another carotenoid, zeaxanthin. β-carotene and zeaxanthin can be
detected using standard methodology and are distinguished by mobility on an HPLC
column. Zeaxanthin diglucoside can be produced by further expression of zeaxanthin
glucosyl transferase (crtX) in an organism that produces zeaxanthin.

Alternatively, canthaxanthin can be produced in organisms that produce phytoene
by expression of phytoene desaturase, lycopene β-cyclase, and β-carotene C4 oxygenase,
an enzyme that converts the methylene groups at the C4 and C4' positions of the
carotenoid to ketone groups. The β-carotene C4 oxygenase from, e.g., Agrobacterium
aurantiacum or Haematococcus pluvialis can be used. See, GenBank Accession Nos.
1136630 and X86782 for a description of the nucleotide and amino acid sequences of the
A. aurantiacum and H. pluvialis enzymes, respectively. The β-carotene C4 oxygenase
from Brevundimonas aurantiaca also can be used. See, Example 2 for a description of
the nucleotide and amino acid sequences. In organisms that do not naturally produce
carotenoids, additional enzymes are required for production of canthaxanthin.
Geranylgeranyl pyrophosphate synthase and phytoene synthase can be expressed such
that the necessary precursors for canthaxanthin synthesis are present.

Astaxanthin also can be produced in microorganisms that naturally produce
carotenoids. For example, a Rhodobacter cell can be engineered such that phytoene
desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4 oxygenase are
expressed and detectable amounts of astaxanthin are produced. Such an organism also
can express an enzyme that can modify the 3 or 3' hydroxyl groups of astaxanthin with
chemical groups such as glucose (e.g., to produce astaxanthin diglucoside), other sugars,
or fatty acids. In addition, a P. stewartii cell can be engineered such that β-carotene C4
oxygenase is expressed and detectable amounts of astaxanthin are produced. Astaxanthin
can be detected as described above, and has maximal absorbance at 480 nm in acetone.
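
The enzyme combinations described in the preceding paragraphs can be summarized as a
small directed graph. The sketch below is a deliberately simplified map, assuming one
route per product and using illustrative names; it is not a complete representation of
the pathway in FIG. 1, and real hosts may reach a given carotenoid by more than one route.

```python
from collections import deque

# Each entry maps a substrate to (enzyme, product) pairs, following the
# conversions described in the text. This is a simplified, illustrative map.
PATHWAY = {
    "neurosporene": [("phytoene desaturase", "lycopene")],
    "phytoene": [("phytoene desaturase", "lycopene")],
    "lycopene": [("lycopene beta-cyclase", "beta-carotene")],
    "beta-carotene": [
        ("beta-carotene hydroxylase", "zeaxanthin"),
        ("beta-carotene C4 oxygenase", "canthaxanthin"),
    ],
    "zeaxanthin": [
        ("zeaxanthin glucosyl transferase", "zeaxanthin diglucoside"),
        ("beta-carotene C4 oxygenase", "astaxanthin"),
    ],
}


def enzymes_needed(start, target):
    """Breadth-first search for the shortest enzyme list from start to target."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, enzymes = queue.popleft()
        if compound == target:
            return enzymes
        for enzyme, product in PATHWAY.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, enzymes + [enzyme]))
    return None  # target is not reachable in this simplified map


# A zeaxanthin producer such as P. stewartii needs only the C4 oxygenase.
print(enzymes_needed("zeaxanthin", "astaxanthin"))
# A phytoene producer needs three enzymes to reach canthaxanthin.
print(enzymes_needed("phytoene", "canthaxanthin"))
```

Representing the host by its native end product makes it straightforward to list which
exogenous activities are still missing for a chosen target carotenoid.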

Yields of astaxanthin and other carotenoids can be increased by expression of a
multifunctional geranylgeranyl pyrophosphate synthase, such as that from S. shibatae
(SEQ ID NO:45) or an Archaebacterial gene from Archaeoglobus fulgidus (GenBank
Accession No. AF120272), in the engineered microorganism. The archaebacterial GGPPS
gene is a homolog of the endogenous Rhodobacter gene and encodes an enzyme that
directly converts 3 IPP molecules and 1 DMAPP molecule to 1 GGPP molecule, thereby
reducing branching of the carotenoid pathway and eliminating production of other less
desirable isoprenoids. Further reductions in less desirable metabolites can be obtained by
eliminating endogenous bacteriochlorophyll biosynthesis, which redirects flow into
carotenoid biosynthesis. For example, the bchO, bchD, and bchI genes can be deleted
and/or replaced with an Archaebacterial GGPPS gene. Additional increases in yield can
be obtained by deletion of the endogenous crtE gene or the endogenous crtC, crtD, crtE,
crtA, crtI, and crtF genes.

Common mutagenesis or knock-out technology can be used to delete endogenous
genes. Alternatively, antisense technology can be used to reduce enzymatic activity. For
example, a R. sphaeroides cell can be engineered to contain a cDNA that encodes an
antisense molecule that prevents an enzyme from being made. The term "antisense
molecule" as used herein encompasses any nucleic acid that contains sequences that
correspond to the coding strand of an endogenous polypeptide. An antisense molecule
also can have flanking sequences (e.g., regulatory sequences). Thus, antisense molecules
can be ribozymes or antisense oligonucleotides. A ribozyme can have any general
structure including, without limitation, hairpin, hammerhead, or axhead structures,
provided the molecule cleaves RNA.

Control of the Ratio of Carotenoids

The relative amounts of particular carotenoids, such as the ratio of astaxanthin to
canthaxanthin, or astaxanthin to zeaxanthin, can be controlled by expression of carotenoid
genes from an inducible promoter or by use of constitutive promoters of different
strengths. As used herein, "inducible" refers to both up-regulation and down-regulation.
An inducible promoter is a promoter that is capable of directly or indirectly activating
transcription of one or more DNA sequences or genes in response to an inducer. In the
absence of an inducer, the DNA sequences or genes will not be transcribed. The inducer
can be a chemical agent such as a protein, metabolite, growth regulator, phenolic
compound, or a physiological stress imposed directly by heat, cold, salt, or toxic elements,
or indirectly through the action of a pathogen or disease agent such as a virus. The inducer
also can be an illumination agent such as light, darkness and light's various aspects, which
include wavelength, intensity, fluorescence, direction, and duration. Examples of inducible
promoters include the lac system and the tetracycline resistance system from E. coli. In
one version of the lac system, expression of lac operator-linked sequences is
constitutively activated by a lacR-VP16 fusion protein and is turned off in the presence of
IPTG. In another version of the lac system, a lacR-VP16 variant is used that binds to lac
operators in the presence of IPTG, which can be enhanced by increasing the temperature
of the cells.

Components of the tetracycline (Tc) resistance system also can be used to regulate
gene expression. For example, the Tet repressor (TetR), which binds to tet operator
sequences in the absence of tetracycline and represses gene transcription, can be used to
repress transcription from a promoter containing tet operator sequences. TetR also can be
fused to the activation domain of VP16 to create a tetracycline-controlled transcriptional
activator (tTA), which is regulated by tetracycline in the same manner as TetR, i.e., tTA
binds to tet operator sequences in the absence of tetracycline but not in the presence of
tetracycline. Thus, in this system, in the continuous presence of Tc, gene expression is
repressed, and to induce transcription, Tc is removed.
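
The two tetracycline-responsive arrangements described above respond to Tc in opposite
directions, which is easy to confuse. The following minimal sketch models each as a
Boolean switch; the function names are illustrative assumptions, and the model ignores
dose dependence and leaky expression.

```python
def tetr_target_transcribed(tetracycline_present):
    """TetR binds the tet operator, and so represses, only when Tc is absent."""
    repressor_bound = not tetracycline_present
    return not repressor_bound


def tta_target_transcribed(tetracycline_present):
    """tTA (TetR fused to VP16) binds and activates only when Tc is absent."""
    activator_bound = not tetracycline_present
    return activator_bound


for tc in (False, True):
    print(
        f"Tc present={tc}: "
        f"TetR-controlled={tetr_target_transcribed(tc)}, "
        f"tTA-controlled={tta_target_transcribed(tc)}"
    )
```

Both proteins bind the operator under the same condition (no Tc); the outcome differs
only because one is a repressor and the other an activator.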

Alternative methods of controlling the ratio of carotenoids include using enzyme 
inhibitors to regulate the activity levels of particular enzymes. 

Production of Carotenoids 

Carotenoids can be produced in vitro or in vivo. For example, one or more 
polypeptides of the invention can be contacted with an appropriate substrate or 
combination of substrates to produce the desired carotenoid (e.g., astaxanthin). See, FIG. 
1 for a schematic of the carotenoid biosynthetic pathway. 

A particular carotenoid (e.g., astaxanthin, lycopene, β-carotene, lutein, zeaxanthin,
zeaxanthin diglucoside, or canthaxanthin) also can be produced by providing an 
engineered microorganism and culturing the provided microorganism with culture 
medium such that the carotenoid is produced. In general, the culture media and/or culture 
conditions are such that the microorganisms grow to an adequate density and produce the 
desired compound efficiently. For large-scale production processes, the following 
methods can be used. First, a large tank (e.g., a 100 gallon, 200 gallon, 500 gallon, or 
more tank) containing appropriate culture medium with, for example, a glucose carbon 
source is inoculated with a particular microorganism. After inoculation, the 
microorganisms are incubated to allow biomass to be produced. Once a desired biomass 
is reached, the broth containing the microorganisms can be transferred to a second tank. 
This second tank can be any size. For example, the second tank can be larger, smaller, or 
the same size as the first tank. Typically, the second tank is larger than the first such that 
additional culture medium can be added to the broth from the first tank. In addition, the 
culture medium within this second tank can be the same as, or different from, that used in 
the first tank. For example, the first tank can contain medium with xylose, while the 
second tank contains medium with glucose. 

Once transferred, the microorganisms can be incubated to allow for the
production of the desired carotenoid. Once produced, any method can be used to isolate
the desired compound. For example, if the microorganism releases the desired carotenoid
into the broth, then common separation techniques can be used to remove the biomass
from the broth, and common isolation procedures (e.g., extraction, distillation, and
ion-exchange procedures) can be used to obtain the carotenoid from the microorganism-free
broth. In addition, the desired carotenoid can be isolated while it is being produced, or it
can be isolated from the broth after the product production phase has been terminated. If
the microorganism retains the desired carotenoid, the biomass can be collected and the
carotenoid can be released by treating the biomass or the carotenoid can be extracted
directly from the biomass. Extracted carotenoid can be formulated as a nutraceutical. As
used herein, a nutraceutical refers to a compound(s) that can be incorporated into a food,
tablet, powder, or other medicinal form that, upon ingestion by a subject, provides a
specific medical or physiological benefit to the subject.

Alternatively, the biomass can be collected and dried, without extracting the
carotenoids. The biomass then can be formulated for human consumption (e.g., as a
dietary supplement) or as an animal feed (e.g., for companion animals such as dogs, cats,
and horses, or for production animals). For example, the biomass can be formulated for
consumption by poultry such as chickens and turkeys, or by cattle, pigs, and sheep.
Feeding of such compositions may increase yield of breast meat in poultry and may
increase weight gain in other farm animals. In addition, the carotenoids may increase
shelf-life of meat products due to the increased antioxidant protection afforded by the
carotenoids. The biomass also can be formulated for use in aquaculture. For example,
biomass that includes an engineered microorganism that is producing, e.g., astaxanthin
and/or canthaxanthin, can be fed to fish or crustaceans to pigment the flesh or carapace,
respectively. Such a composition is particularly useful for feeding to fish such as salmon,
trout, sea bream, or snapper, or crustaceans such as shrimp, lobster, and crab.

One or more components can be added to the biomass before or after drying,
including vitamins, other carotenoids, antioxidants such as ethoxyquin, vitamin E,
butylated hydroxyanisole (BHA), butylated hydroxytoluene (BHT), or ascorbyl palmitate,
vegetable oils such as corn oil, safflower oil, sunflower oil, or soybean oil, and an edible
emulsifier, such as soy bean lecithin or sorbitan esters. Addition of antioxidants and
vegetable oils can help prevent degradation of the carotenoid during processing (e.g.,
drying), shipment, and storage of the composition.

The invention will be further described in the following examples, which do not
limit the scope of the invention described in the claims.

EXAMPLES 

Example 1 - Cloning of the zeaxanthin gene cluster from Pantoea stewartii:
Genomic DNA from P. stewartii was isolated and digested with restriction enzymes to
yield genomic DNA fragments approximately 8-10 kb in size. These genomic DNA
fragments were ligated into a vector cut with the same restriction enzyme, and 
electroporated into electrocompetent E. coli. Transformant colonies were individually 
picked and transferred onto fresh solid media with the appropriate antibiotic selection 
(ampicillin/ampicillin substitute). It was thought that E. coli colonies containing the P. 
stewartii carotenoid genes would appear yellow in color due to the production of 
zeaxanthin pigment or red due to the production of lycopene. Although at least 2000 
ampicillin resistant E. coli transformants were screened, none of the colonies were found 
to contain the P. stewartii carotenoid genes. 

Instead, a second, PCR based method was used to identify and sequence the 
carotenoid (crt) gene cluster from P. stewartii genomic DNA. Degenerate primers were 
designed based on homologous regions identified in the crt genes from Erwinia herbicola 
and Erwinia uredovora. Table 2 provides the position of the crt genes in E. herbicola and 
E. uredovora. 

TABLE 2

            Start of Gene (nucleotide #)              End of Gene (nucleotide #)
Gene name   E. herbicola         E. uredovora         E. herbicola         E. uredovora
CrtE        3535                 198                  4458                 1133
Orf-6       4521                 -                    5564                 -
CrtX        5561                 1143                 6802                 2438
CrtY        6799                 2422                 7959                 3570
CrtI        7956                 3582                 9434                 5060
CrtB        9431                 5096                 10360                5986
CrtZ        10826 (complement)   6452 (complement)    10296 (complement)   5925 (complement)
Orf-12      12127 (complement)   -                    10916 (complement)   -

The following primers were designed (Table 3) and used in various combinations
to yield PCR products of varying lengths. P. stewartii genomic DNA was used as 
template. 

TABLE 3

Primer Name    Primer Sequence                                  SEQ ID NO
P.s.BCHy1      5'-ATYATGCACGGCTGGGGWTGGSGMTGGCA-3'              13
P.s.BCHy2      5'-GGCCARCGYTGATGCACCAGMCCGTCRTGCA-3'            14
P.s.PS1        5'-CTGATGCTCTAYGCCTGGTGCCGCCA-3'                 15
P.s.PS2        5'-TCGCGRGCRATRTTSGTCARCTG-3'                    16
P.s.LBC1       5'-ATBMTSATGGAYGCSACSGT-3'                       17
P.s.LBC2       5'-YTRATCGARGAYACGCRCTA-3'                       18
P.s.LBC3       5'-RSGGCAGYGAATAGCCRGTG-3'                       19
P.s.LBC4       5'-AACAGCATSCGRTTCAGCAKGCGSA-3'                  20
P.s.PD5        5'-CCGACGGTKATCACCGATCC-3'                       21
P.s.PD6        5'-CTGCGCCSACCAGGTAGAG-3'                        22
P.s.GGPPS1     5'-CTYGACGAYATGCCCTGCATGGAC-3' (MD92)            23
P.s.GGPPS2     5'-GTCGATTTWCCSGCGTCCTICATTG-3' (MD93)           24
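
The primers in Table 3 use IUPAC ambiguity codes (e.g., R, Y, S, W, K, M), so each
entry stands for a pool of related sequences. As a minimal sketch, assuming only the
standard IUPAC code table, the size of each pool can be computed and, for small pools,
the individual sequences can be listed. The function names are illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Yield every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    for combo in product(*pools):
        yield "".join(combo)


ps2 = "TCGCGRGCRATRTTSGTCARCTG"  # P.s.PS2 (SEQ ID NO:16) from Table 3
pd5 = "CCGACGGTKATCACCGATCC"     # P.s.PD5 (SEQ ID NO:21) from Table 3
print(degeneracy(ps2), degeneracy(pd5))
print(list(expand(pd5)))
```

Higher degeneracy widens the range of templates a primer can anneal to but dilutes each
individual sequence in the pool, which is one reason annealing temperature and MgCl2
concentration are often varied when such primers are used.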



PCR was performed in a Gradient Thermocycler, and was started by incubating at
96°C for 5 minutes, followed by 40 cycles of denaturation at 96°C for 30 seconds,
annealing at 40°C, 45°C, 50°C, 55°C, or 60°C for 105 seconds, and extension at 72°C for
90 seconds, followed by incubation at 72°C for 10 minutes. The concentration of MgCl2 in
the PCR reactions also was varied and ranged from a final concentration of 1.5 mM to 6
mM. Table 4 provides the predicted size of the PCR products with various primer
combinations.

TABLE 4
Expected sizes of PCR Products

Primer Combination     PCR product length (bp)   Product Observed
BCHy1/BCHy2            230                       Yes
PS1/PS1                410                       Yes
LBC1/LBC3              320                       Yes
LBC1/LBC4              460                       Yes
PD1/PD2                420                       No
PD1/PD4                1260                      No
LBC2/LBC3              240                       No
PD3/PD4                410                       Yes
LBC2/LBC4              380                       Yes
PD5/PD6                1200                      Yes
PS1/PS2                410                       Yes
BCHy1/BCHy2            230                       Yes
PsGGPPS1/PsGGPPS2      470                       Yes
LBCDown1/PDUp1         470                       Yes
PDDown1/PSUp1          300                       Yes
BCHyDown1/PSDown1      700                       Yes
LBCUp1/GGPPSdn1        1600                      Yes

PCR reactions were electrophoresed through agarose gels to estimate sizes of PCR
products, and DNA was extracted from the gel using a Qiagen gel extraction kit. The
purified PCR products were submitted to the Advanced Genetic Analysis Center (AGAC)
at the University of Minnesota for sequencing. The obtained DNA sequences were
subjected to BLAST analysis to determine if the sequences were homologous to crt genes
from other bacteria. Sequence analysis of the 1.2-kb DNA fragment indicated that there
was homology to phytoene desaturase (crtI) genes from E. herbicola and E. uredovora,
while the 0.47-kb product had homology with the crtE genes from E. herbicola and E.
uredovora.

Based on the DNA sequence information generated using the degenerate primers
and amplified regions of the carotenoid genes from P. stewartii, primers specific for the
P. stewartii crt genes were designed and are shown in Table 5. These specific primers
were used to obtain sequence information upstream and downstream of the DNA regions
amplified with the degenerate primers. This approach was used to extend the DNA
sequence information available for the P. stewartii crt genes.

TABLE 5
P. stewartii primers

Primer            Sequence                                   SEQ ID NO
PsOp.crtE         5'-GGCCGAATTCCAACGATGCTCTGGCAGTTA-3'       25
PsOp.crtZ(-)      5'-GGCCAGATCTACTTCAGGCGACGCTGAGAG-3'       26
PsOp.crtZ(+)      5'-GGCCAGATCTTACGCGCGGGTAAAGCCAAT-3'       27
PsOp.crtZ(2+)     5'-GGCCTCTAGAATTACCGCGTGGTTCTGAAG-3'       28
PsOp.crtZ(2-)     5'-GGCCTCTAGATCTGTACGCGCCACCGTTAT-3'       29

After unsuccessful attempts to complete the crt gene cluster sequence from
P. stewartii using PCR, the Universal Genome Walker kit from Clontech was used
to obtain the complete sequence of the P. stewartii crtE and crtZ genes. This kit uses a
PCR-based approach. The following primer pairs were synthesized and used for the
genome walking experiments: GWcrtE2, 5'-CATCGGTAAGATCGTCAAGCAACTGAA-3'
(SEQ ID NO:30) and GWcrtE1, 5'-GATTTACCTGCATCCTGATTGATGTCT-3' (SEQ ID
NO:31); and GWcrtZ1, 5'-ATGTATAACCGTTTCAGGTAGCCTTTG-3' (SEQ ID NO:32)
and GWcrtZ2, 5'-AATACAGTAAACCATAAGCGGTCATGC-3' (SEQ ID NO:33). The
sequences of the crt genes and encoded proteins from P. stewartii were compared to the
sequences of the crt genes and proteins from E. herbicola and E. uredovora using BLAST
under default parameters. See SEQ ID NOS:1-12 for the nucleotide and amino acid
sequences of the P. stewartii crt genes. The results of the alignment are provided in
Table 6.

TABLE 6
Comparison of crt genes and proteins from P. stewartii to E. herbicola and E. uredovora

          Comparison of nucleotide             Comparison of protein
          sequence of P. stewartii to          sequence of P. stewartii to
Gene      E. herbicola     E. uredovora        E. herbicola     E. uredovora
crtE      59%              80%                 81%              83%
crtX      56%              75%                 75%              74%
crtY      58%              77%                 83%              82%
crtI      69%              81%                 89%              89%
crtB      63%              81%                 88%              88%
crtZ      65%              84%                 65%              88%
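The percentages in Table 6 are identity values reported by BLAST. As a rough illustration of what such a figure measures, the sketch below computes percent identity over two sequences that have already been aligned (gaps written as "-"). This is a simplified stand-in and not BLAST's own procedure: BLAST reports identity over the local alignment it finds, and the toy sequences and the function name `percent_identity` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be the same length; '-' marks a gap. Columns that are
    gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy aligned sequences, invented for illustration only.
print(percent_identity("ATGCCGTA", "ATGACGTA"))  # one mismatch in eight columns
print(percent_identity("ATGC-A", "ATGGTA"))      # one mismatch and one gap
```

Counting gap columns in the denominator is a design choice; other conventions divide by the length of the shorter sequence or by the number of ungapped columns, and they give different percentages for the same alignment.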



Example 2 - Cloning of a β-carotene C4 Oxygenase from Brevundimonas
aurantiaca: Degenerate PCR primers for crtW were designed based on crtW genes from
Bradyrhizobium, Alcaligenes, Agrobacterium aurantiacum, and Paracoccus marcusii.
The primers had the following sequences: crtW(181P.m.),
5'-TTCATCATCGCGCATGAC-3' (SEQ ID NO:34) and crtW(668P.m.),
5'-AGRTGRTGYTCGTGRTGA-3' (SEQ ID NO:35), and were synthesized by Integrated
DNA Technologies Inc. (Coralville, IA). PCR was performed in a Mastercycler gradient
machine (Eppendorf) with genomic DNA from B. aurantiaca (ATCC Accession No.
15266). Reaction conditions included five minutes at 96°C, followed by 30 cycles of
denaturation at 94°C for 30 sec, annealing at 50°C for 2 min, and extension at 72°C for
2 min 30 sec, and a final 72°C incubation for 10 min. An approximately 500-bp PCR
product was obtained and cloned into the vector pCR-Blunt II-TOPO (Invitrogen Corp.,
Carlsbad, CA).

Independent clones were sequenced using the universal M13 forward and reverse
primers. DNA sequencing was carried out at AGAC, University of Minnesota, St. Paul,
MN. Partial nucleotide sequence of the crtW gene was obtained. Alignment of the partial
sequence with known crtW genes indicated that the sequences aligned toward the
N-terminus and C-terminus, respectively, of the crtW genes from Bradyrhizobium,
Alcaligenes, Agrobacterium aurantiacum, and Paracoccus marcusii. The Universal
Genome Walker kit from Clontech was used to obtain the complete sequence of the B.
aurantiaca crtW gene. Primers were synthesized based on the partial sequence and used
for the genome walking experiments.

Upon obtaining sequence from the ends of the gene, the following oligonucleotide
primers were synthesized and used to amplify the complete crtW gene from genomic
DNA: 5'-GCGGCATAGGCTAGATTGAAG-3' (primer 1, Tm = 72°C, SEQ ID NO:36)
and 5'-GCGAGTTCCTTCTCACCTAT-3' (primer 2, Tm = 67°C, SEQ ID NO:37). B.
aurantiaca (ATCC 15266) genomic DNA was prepared with the Qiagen genomic-tip
500G kit (Valencia, CA; Catalog # 10262) following the manufacturer's protocol. Briefly,
30 ml of B. aurantiaca culture were grown overnight at 30°C in ATCC medium 36
(Caulobacter medium; 2 g/l peptone, 1 g/l yeast extract, 0.2 g/l MgSO4·7H2O). Cultures
were harvested by centrifugation (15,000 x g; 10 minutes) and genomic DNA was purified
following the manufacturer's recommended protocol (Qiagen Genomic DNA Handbook
for Blood, Cultured Cells, Tissue, Mouse Tails, Yeast, Bacteria (Gram- & some Gram+)).
The Expand DNA polymerase system (Roche Molecular Biochemicals, Indianapolis, IN;
catalog # 1732641) was used in a reaction that included 2 µl of B. aurantiaca genomic
DNA (50 ng/µl), 1 µl of primer 1 (100 pmol/µl), 1 µl of primer 2 (100 pmol/µl), 5 µl of
10x PCR buffer, 1 µl of Expand DNA polymerase (3.5 U/µl), 2.5 µl of dimethyl sulfoxide
(DMSO), 2 µl of dNTPs (10 nmol/µl each), and 35.5 µl of ddH2O. Reaction conditions
included five minutes at 96°C, followed by 30 cycles of denaturation at 94°C for 30 sec,
annealing at 50°C for 2 min, and extension at 72°C for 2 min 30 sec, and a final 72°C
incubation for 10 min.
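The reaction recipe above can be checked arithmetically. The sketch below sums the listed component volumes and derives the final concentrations they imply. It is a worked check only; the dictionary keys and variable names are hypothetical labels, and the stock concentrations are the ones stated in the text.

```python
# Component volumes in microliters, as listed in the text.
components_ul = {
    "genomic DNA (50 ng/ul)": 2.0,
    "primer 1 (100 pmol/ul)": 1.0,
    "primer 2 (100 pmol/ul)": 1.0,
    "10x PCR buffer": 5.0,
    "Expand DNA polymerase (3.5 U/ul)": 1.0,
    "DMSO": 2.5,
    "dNTPs (10 nmol/ul each)": 2.0,
    "ddH2O": 35.5,
}

total_ul = sum(components_ul.values())

# Final amounts or concentrations implied by the stocks and volumes.
template_ng = 50 * components_ul["genomic DNA (50 ng/ul)"]
buffer_fold = 10 * components_ul["10x PCR buffer"] / total_ul
# 1 pmol/ul equals 1 uM, so the ratio below is already in micromolar.
primer_um = 100 * components_ul["primer 1 (100 pmol/ul)"] / total_ul
dmso_percent = 100 * components_ul["DMSO"] / total_ul
# 10 nmol/ul equals 10 mM, so the ratio below is in millimolar.
dntp_mm_each = 10 * components_ul["dNTPs (10 nmol/ul each)"] / total_ul

print(f"total volume: {total_ul} ul")
print(f"template: {template_ng} ng, buffer: {buffer_fold}x")
print(f"each primer: {primer_um} uM, DMSO: {dmso_percent}% v/v")
print(f"each dNTP: {dntp_mm_each} mM")
```

The volumes add up to a 50 µl reaction, so the 5 µl of 10x buffer is diluted to 1x, each primer is at 2 µM, DMSO is at 5% v/v, and each dNTP is at 0.4 mM.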

PCR products were electrophoresed through a 0.8% agarose gel and the ~0.85 kb
band was excised from the gel and purified using the Qiagen QIAquick Gel Extraction
Kit (catalog #28704) following the manufacturer's recommended protocol (QIAquick
Spin Handbook). Gel-purified PCR product was cloned into the blunt-end cloning site of
pCR-Blunt II-TOPO (Clontech; Palo Alto, CA) to generate pTOPOcrtW. Ligation
mixtures were electroporated (25 µF, 200 Ohms, 12.5 kV/cm) into E. coli DH10B
Electromax cells (Gibco BRL; Gaithersburg, MD; catalog #18290-015). Transformants
were allowed to recover 60 minutes at 37°C with shaking in 1 ml of SOC medium. Cells
were plated on LB agar + 50 µg/ml kanamycin and allowed to grow overnight at 37°C.
Transformant colonies were inoculated into 1 ml LB broth + 50 µg/ml kanamycin and
allowed to grow overnight at 37°C with shaking. Minipreps were prepared using the
QIAprep Spin Miniprep Kit (50) (catalog #27104) following the manufacturer's protocol
and the presence of pTOPOcrtW was screened for by restriction analysis with EcoRI.
EcoRI digests of pTOPOcrtW yielded products of ~0.85 kbp and 3.5 kbp.
The crtW gene was sequenced by AGAC, University of Minnesota, St. Paul, MN.
The nucleotide sequence of the crtW gene from B. aurantiaca is provided in SEQ ID
NO:38, and the protein encoded by the crtW gene is provided in SEQ ID NO:39.

Example 3 - Transformation of pTOPOcrtW into Pantoea stewartii: The
following protocol describes expression of crtW in the zeaxanthin producing host P.
stewartii. This yields a transformed host that is capable of producing astaxanthin (i.e.,
3,3'-dihydroxy-β,β-carotene-4,4'-dione) and adonixanthin (3,3'-dihydroxy-β,β-carotene-
4-one). Electrocompetent P. stewartii (ATCC 8200) cells were prepared by culturing 50
ml of a 5% inoculum of P. stewartii cells in LB at 30°C with agitation (250 rpm) until an
OD590 of 0.5-1.0 was reached. The bacteria were washed in 50 ml of 10 mM HEPES (pH
7.0) and centrifuged for 10 minutes at 10,000 x g. The wash was repeated with 25 ml of
10 mM HEPES (pH 7.0) followed by the same centrifugation protocol. The cells then
were washed once in 25 ml of 10% glycerol. Following centrifugation, the cells were
resuspended in 500 µl of 10% glycerol. Forty µl aliquots were frozen and kept at -80°C
until use.

Plasmid pTOPOcrtW was electroporated into electrocompetent P. stewartii cells
(25 µF, 25 kV/cm, 200 Ohms) and plated onto LB agar plates containing 50 µg/ml
kanamycin. As a negative control, self-ligated pCR-Blunt II-TOPO parent vector also
was electroporated into P. stewartii and plated onto LB agar plates containing 50 µg/ml
kanamycin. Individual colonies of P. stewartii::pTOPOcrtW were screened by visual
inspection for a phenotypic change from bright yellow pigmentation (production of
zeaxanthin) to a reddish-orange pigmentation (production of astaxanthin) and chosen for
further pigment analysis. No phenotypic change was noted for individual colonies of P.
stewartii::pCR-Blunt II-TOPO, so clones were randomly chosen for pigment analysis.

Production of astaxanthin was confirmed by HPLC/MS. Carotenoids were
extracted from cells harvested from 5 day old cultures of P. stewartii::pTOPOcrtW or P.
stewartii::pCR-Blunt II-TOPO (25 ml) grown in LB with 50 µg/ml kanamycin by
resuspending the washed cell pellet in 5 ml of acetone. Glass beads were added and the
mixture was incubated for 60 minutes at room temperature in the dark with occasional
vortexing. The cells were separated from the acetone extract by centrifugation at 15,000 x
g for 10 minutes. The acetone supernatant then was analyzed by HPLC/MS.

A Waters 2790 LC system was used with two reverse-phase C30 specialty
columns designed for carotenoid separation (YMCa Carotenoid S3m; 2.0 x 150 mm, 3
µm particle size; Waters Corporation, PN CT99S031502WT), in tandem. The columns
were run at room temperature. A gradient of Mobile Phase A (0.1% acetic acid) and
Mobile Phase B (90% acetone) was used to separate zeaxanthin and astaxanthin
according to the following gradient timetable: 0 min (10%A, 90%B), 10 min (100%B), 12
min (10%A, 90%B), 15 min (10%A, 90%B). Flow rate was 0.3 ml/min. Samples were
stored at 20°C in an autosampler and a volume of 25 µL was injected. A Waters 996
photodiode array detector, 350-550 nm, was used to detect zeaxanthin and astaxanthin.
Under these chromatography conditions astaxanthin eluted at approximately 5.42-5.51
min and zeaxanthin eluted at approximately 6.22-6.4 min.
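The gradient timetable above lists the mobile phase composition only at four time points. The sketch below estimates the percentage of Mobile Phase B at any time in the run, under the assumption (not stated in the text) that the pump ramps linearly between consecutive timetable entries. The names `TIMETABLE` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in minutes, percent Mobile Phase B), taken from the gradient timetable.
TIMETABLE = [(0.0, 90.0), (10.0, 100.0), (12.0, 90.0), (15.0, 90.0)]


def percent_b(t_min: float) -> float:
    """Percent B at time t_min, assuming linear ramps between timetable points."""
    times = [t for t, _ in TIMETABLE]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(TIMETABLE) - 1:  # exactly at the final time point
        return TIMETABLE[-1][1]
    (t0, b0), (t1, b1) = TIMETABLE[i], TIMETABLE[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Composition near the reported astaxanthin and zeaxanthin elution windows.
for label, t in (("astaxanthin", 5.42), ("zeaxanthin", 6.22)):
    b = percent_b(t)
    print(f"{label}: t = {t} min, B = {b:.2f}%, A = {100 - b:.2f}%")
```

Under the linear-ramp assumption both analytes elute while the mobile phase is between 95% and 97% B, and the column returns to the starting composition by 12 minutes.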

Carotenoid standards were used to identify the peaks. Astaxanthin was obtained
from Sigma Chemical Co. (St. Louis, MO) and zeaxanthin was obtained from
Extrasynthese (France). UV-Vis absorption spectra were used as diagnostic features for
the carotenoids, as were the molecular ion and fragmentation patterns generated using
mass spectrometry. A positive-ion atmospheric pressure chemical ionization mass
spectrometer was used; scan range, 400-800 m/z with a quadrupole ion trap. A
representative HPLC chromatogram is shown in FIG. 3, which confirms production of
astaxanthin in P. stewartii transformed with the B. aurantiaca crtW gene.

Example 4 - Simultaneous Production of CoQ-10 and (3S, 3'S) Astaxanthin in
a Microorganism: Although Phaffia rhodozyma is not capable of producing the 3S, 3'S
isoform of astaxanthin, it is known to produce Coenzyme Q-10. This compound has been
found to have particularly high value as a nutraceutical. The current invention is of
particular value since R. sphaeroides is known to produce Coenzyme Q-10 and has been
transformed with genes that, while novel, are nevertheless homologous to native genes in
the MABP. Consequently, the described organism can be expected to simultaneously
produce both Coenzyme Q-10 and (3S, 3'S)-ATX. This is the first description of the
production of both (3S, 3'S)-ATX and Coenzyme Q-10 in a single microbial host.
The identification of (3S, 3'S)-ATX can be accomplished as described by Maoka,
T., et al., J. Chromatogr. 318:122-124 (1985). Briefly, this consists of extraction of the
carotenoid pigments by contacting the biomass with a suitable organic solvent such as
acetone or dichloromethane. The carotenoid extract is then dried under a stream of
liquid nitrogen and resuspended in a solvent of n-hexane-dichloromethane-ethanol
(48:16:0.6). The extract is applied to a Sumipax OA-2000 (particle size 10 µm) 250 x
mm I.D. (Sumitomo Chemicals, Osaka, Japan) chiral resolution HPLC column at a flow
rate of 0.8 ml/min. Generally, the order of elution is expected to be (3R, 3'R)-ATX
followed by (3R, 3'S; 3S, 3'R)-ATX followed by (3S, 3'S)-ATX. A similar separation is
described in Maoka, T., et al., Comp. Biochem. Physiol. 83B:121-124 (1986). Briefly,
this consists of isolation of the carotenoid, derivatization to the dibenzoate form with
benzoyl chloride and separation of the enantiomers using a Sumipax OA-2000 chiral
resolution HPLC column.

Example 5 - Transformation of the multifunctional GGPP synthase from
Archaeoglobus fulgidus into a Rhodobacter ppsR strain with the crtY and crtI genes
from Pantoea stewartii inserted into the chromosome: The following protocol
describes the generation of a β-carotene producing strain of R. sphaeroides (ATCC
35053), a facultative photoheterotroph, in which the ppsR gene was deleted by using the
in-frame deletion procedure of Higuchi, R., et al., Nucleic Acids Res. 16:7351-7367 to
generate strain AREG. Table 7 describes the strains and plasmids used in this example.
PpsR is a transcription factor that is involved in the repression of photosystem gene
expression under aerobic growth conditions. The region of the chromosome that included
the native tspO, crtC, crtD, crtE and crtF genes of AREG was replaced by the lycopene
β-cyclase (crtY) and phytoene desaturase (crtI) genes from P. stewartii using the
procedure of Oh and Kaplan, Biochemistry 38:2688-2696 (1999); and Lenz, et al., J.
Bacteriology 176:4385-4393 (1994), to generate the strain AREG(A5:YI). Briefly, the
crtY and crtI genes were cloned into pLO1, a suicide vector for R. sphaeroides
containing the kanamycin resistance gene and the Bacillus subtilis sacB gene encoding
sensitivity to sucrose. DNA fragments flanking the crtYI genes and identical in sequence
to ~500 bp internal fragments of the R. sphaeroides tspO and crtF genes were then cloned
into pLO1. These flanking DNA regions correspond to the desired region for insertion of
the crtYI genes. Insertion of the crtYI genes in AREG was confirmed using PCR analyses
and appropriate PCR primers specific to the crtYI genes as well as flanking regions of the
R. sphaeroides genome. The crtYI (P. stewartii) insertion and tspO, crtC, crtD, crtE and
crtF (R. sphaeroides) deletion resulted in the lack of native carotenoid production and a
change in the pigmentation from red to green, confirming the insertion event.



TABLE 7
Description of Rhodobacter Strains and Plasmids

Strain: AREG
  Description: ATCC 35053; ppsR regulatory mutant
  Major Carotenoid Produced: Sphaeroidenone (native carotenoid)
  Comments: Regulatory mutant

Strain: AREG(A5:YI)
  Description: crtY and crtI genes of P. stewartii replaced 5 host genes (tspO, crtC,
    crtD, crtE and crtF) on chromosome
  Major Carotenoid Produced: None
  Comments: β-carotene biosynthetic genes placed in chromosome. No carotenoid
    production because of crtE deletion

Strain: AREG(A5:YI)::pPctrl
  Description: Control vector introduced into AREG(A5:YI) host
  Major Carotenoid Produced: None
  Comments: Control vector contains rrnB promoter but no biosynthetic genes

Strain: AREG(A5:YI)::pPgps
  Description: gps gene of A. fulgidus inserted into pPctrl control vector and
    introduced into AREG(A5:YI) host
  Major Carotenoid Produced: β-Carotene
  Comments: gps gene on plasmid complements crtE deletion. Complete pathway for
    β-carotene production

Strain: AREG(A5:YI)(AA:gps)
  Comments: gps gene inserted into genome complements crtE deletion. Complete
    pathway for β-carotene production

Strain: AREG(A5:YI)(AA:gps)::pPWZ
  Description: crtW and crtZ genes inserted into pPctrl control vector and introduced
    into AREG(A5:YI)(AA:gps) host
  Major Carotenoid Produced: Astaxanthin
  Comments: crtW and crtZ genes convert β-carotene into astaxanthin

Strain: AREG(A5:YI)(AA:gps)::pPgpsWZ
  Description: gps, crtW and crtZ genes inserted into pPctrl control vector and
    introduced into AREG(A5:YI)(AA:gps) host
  Major Carotenoid Produced: Astaxanthin
  Comments: Additional copies of A. fulgidus gps gene on plasmid increase
    production of astaxanthin

Plasmids        Genetic elements inserted
pBBR1MCS2       None
pPctrl          rrnB promoter
pPgps           rrnB promoter, A. fulgidus gps
pPWZ            rrnB promoter, P. stewartii crtZ, B. aurantiacum crtW
pPgpsWZ         rrnB promoter, A. fulgidus gps, P. stewartii crtZ, B. aurantiacum crtW

The pPctrl vector was constructed by inserting a copy of the R. sphaeroides rrnB
promoter (GenBank Accession # X53854; rrnBP) into the vector pBBR1MCS2 (GenBank
Accession # U23751). The rrnB promoter was isolated from the vector pTEX24 (S.
Kaplan) by a BamHI restriction enzyme digest, which released the promoter as a 363 bp
fragment. This fragment was gel purified from a 2% Tris-acetate-EDTA (TAE) agarose
gel. To prepare the pBBR1MCS2 vector for ligation, it also was digested with BamHI
and the enzyme was heat inactivated at 80°C for 20 minutes. The digested vector was
dephosphorylated with shrimp alkaline phosphatase (Roche Molecular Biochemicals,
Indianapolis, IN), and gel purified from a 1% TAE-agarose gel. The prepared vector and
the rrnB fragment were ligated using T4 DNA ligase at 16°C for 16 hours to generate the
plasmid pPctrl. One µL of ligation reaction was used to electroporate 40 µL of E. coli
Electromax™ DH10B™ cells (Life Technologies, Inc., Rockville, MD).
Electroporated cells were plated on LB media containing 25 µg/mL of kanamycin
(LBK). pPctrl DNA was isolated from cultures of single colonies and was digested with
HindIII to confirm the presence of a single insertion of the rrnB promoter. The sequence
of pPctrl also was confirmed by DNA sequencing.

The multifunctional GGPP synthase (gps) gene from A. fulgidus (GenBank
Accession No. AF120272) was cloned into the multiple cloning site of pPctrl to generate
the construct pPgps.

Electrocompetent AREG(A5:YI) cells were prepared as follows: 5 ml cultures
were inoculated using Sistrom's media supplemented with trace elements, vitamins
(O'Gara, et al., J. Bacteriol. 180:4044-4050 (1998); Cohen-Bazire, et al., J. Cell. Comp.
Physiol. 49:25-68 (1957)) and 0.4% glucose as a carbon source, and grown overnight at
30°C with shaking. This culture was diluted 1/100 in 300 mL of the same media and
grown to an OD660 of 0.5-0.8. The cells were chilled on ice for 10 minutes and then
centrifuged for 6 minutes at 7,500 g. The supernatant was discarded and the cell pellet
was resuspended in ice-cold 10% glycerol at half of the original volume. The cells were
pelleted by centrifugation for 6 minutes at 7,500 g. The supernatant was again discarded
and cells were resuspended in ice-cold 10% glycerol at one quarter of the original volume.
The last centrifugation and resuspension steps were repeated, followed by centrifugation
for 6 minutes at 7,500 g. The supernatant was decanted and the cells were resuspended in
the small volume of glycerol that did not drain out. Additional ice-cold 10% glycerol was
added to resuspend the cells if necessary. Forty µL of the resuspended cells was used in a
test electroporation (see below) to determine if the cells needed to be concentrated by
centrifugation or diluted with ice-cold 10% glycerol. Time constants of 8.5-9.0 resulted
in good transformation efficiencies. Once an acceptable time constant was achieved, cells
were aliquoted into cold microfuge tubes and stored at -80°C. All water used for media
and glycerol was 18 Mohm or higher.

Electroporation of AREG(A5:YI) was carried out as follows. One µL of pPgps or
pPctrl vector DNA was gently mixed into 40 µL of AREG(A5:YI) electrocompetent cells,
which then were transferred to an electroporation cuvette with a 0.2 cm gap.
Electroporations were conducted using a Biorad Gene Pulser II (Biorad, Hercules, CA)
with settings at 2.5 kV of potential, 400 ohms of resistance, and 25 µF of capacitance.
Cells were recovered in 400 µL SOC media at 30°C for 6,6 hours. The cells were then
plated, 200 µL per plate, on LB medium containing 50 µg/mL kanamycin and incubated at
30°C for 5-6 days.
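The electroporation settings above can be related to two derived quantities: the field strength across the cuvette (voltage divided by gap width) and the nominal RC time constant of the pulse circuit (resistance times capacitance). The sketch below is a worked calculation under simplifying assumptions: it treats the 400 ohm setting as the only resistance and ignores the sample's own resistance, and the function names are hypothetical.

```python
def field_strength_kv_per_cm(voltage_kv: float, gap_cm: float) -> float:
    """Electric field across the cuvette gap, in kV/cm."""
    return voltage_kv / gap_cm


def nominal_time_constant_ms(resistance_ohm: float, capacitance_uf: float) -> float:
    """Nominal RC time constant in milliseconds (sample resistance ignored)."""
    seconds = resistance_ohm * capacitance_uf * 1e-6
    return seconds * 1e3


# Settings stated in the text for R. sphaeroides electroporation.
field = field_strength_kv_per_cm(voltage_kv=2.5, gap_cm=0.2)
tau = nominal_time_constant_ms(resistance_ohm=400, capacitance_uf=25)

print(f"field strength: {field:.1f} kV/cm")
print(f"nominal time constant: {tau:.1f} ms")
```

With these settings the field is 12.5 kV/cm and the nominal time constant is 10 ms. The text reports time constants of 8.5-9.0 without units; if those are milliseconds, they sit slightly below the nominal value, which is the direction expected when the sample conducts some current in parallel with the set resistance.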
After incubation, greenish colonies were observed on plates of AREG(A5:YI)
transformed with pPctrl plasmid DNA. The colonies that appeared on plates of
AREG(A5:YI) transformed with pPgps plasmid DNA appeared yellow. The yellow
pigmentation was indicative of β-carotene production in AREG(A5:YI) expressing the
gps gene. Cultures were grown in media supplemented with vitamins, trace elements and
0.4% glucose as well as 50 µg/mL kanamycin at 30°C with shaking for 24-48 hours.
Carotenoids were extracted and subjected to LCMS analysis as described above. Under
the chromatography conditions used, β-carotene eluted at approximately 13.87-14.2 min.
β-carotene standard (Sigma Chemical, St. Louis, MO) was used to identify the peaks.
The UV-Vis absorption spectra and the retention time using HPLC were used as
diagnostic features for β-carotene identification in AREG(A5:YI) transformed with
pPgps DNA, as well as the molecular ion and fragmentation patterns generated during
mass spectrometry. Thus, the production of β-carotene was confirmed in AREG(A5:YI)
expressing the A. fulgidus gps gene.

Example 6: The following protocol describes the
generation of an astaxanthin producing strain of R. sphaeroides using AREG(A5:YI),
described above. See also Table 7 for further description of the strains and plasmids that
were used in this example. Using the gene insertion method described by Higuchi, R., et
al., Nucleic Acids Res. 16:7351-7367, the crtA gene of AREG(A5:YI) was replaced by the
gps gene from A. fulgidus to generate the strain AREG(A5:YI)(AA:gps).

Electrocompetent AREG(A5:YI)(AA:gps) cells were generated as described above.

The construct pPgpsWZ was produced by cloning the crtW gene from B.
aurantiacum, the crtZ gene from P. stewartii, and the gps gene from A. fulgidus into the
pPctrl plasmid using appropriate restriction enzymes. The construct pPWZ was produced
by cloning the crtW gene from B. aurantiacum and the crtZ gene from P. stewartii into the
pPctrl plasmid using appropriate restriction enzymes.

The pPWZ or pPgpsWZ constructs were electroporated into electrocompetent
AREG(A5:YI)(AA:gps) as described earlier to generate AREG(A5:YI)(AA:gps)::pPWZ or
AREG(A5:YI)(AA:gps)::pPgpsWZ, respectively. Transformation mixtures were plated
out onto LB plates containing 50 µg/ml kanamycin. PCR analyses using PCR primers
specific for crtZ were used to confirm the presence of the pPWZ or pPgpsWZ plasmids in
AREG(A5:YI)(AA:gps).

Single colonies of AREG(A5:YI)(AA:gps)::pPWZ or
AREG(A5:YI)(AA:gps)::pPgpsWZ were grown up in media supplemented with 50 µg/ml
kanamycin as described earlier. Cell pellets were washed with distilled water and then
carotenoids were extracted using acetone:methanol (7:2) at 30°C for 30 minutes with
shaking at 225 rpm. Carotenoid analysis was performed using the LCMS analysis
described above. The UV-Vis absorption spectra and the retention time using HPLC
were used as diagnostic features for astaxanthin identification in
AREG(A5:YI)(AA:gps)::pPWZ and AREG(A5:YI)(AA:gps)::pPgpsWZ, as well as the
molecular ion and fragmentation patterns generated during mass spectrometry. The
production of astaxanthin was confirmed in both AREG(A5:YI)(AA:gps)::pPWZ and
AREG(A5:YI)(AA:gps)::pPgpsWZ. Increased astaxanthin production was observed in
AREG(A5:YI)(AA:gps)::pPgpsWZ.
Example 7 - Cloning and sequencing of the gps gene from Sulfolobus shibatae:
Degenerate primer sequences MFGGPP1 (5'-CCAYGAYGAYATWATGGA-3', SEQ ID
NO:40) and MFGGPP2 (5'-YTTYTTVCCYTYCCTAAT-3', SEQ ID NO:41) were
designed based on conserved sequences in gps gene sequences from Sulfolobus
solfataricus and Sulfolobus acidocaldarius and synthesized by Integrated DNA
Technologies (Coralville, IA). PCR was performed in a Mastercycler gradient machine
(Eppendorf) with genomic DNA from S. shibatae (ATCC Accession No. 51178, lot
162977). Reaction conditions included five minutes at 96°C, followed by 30 cycles of
denaturation at 94°C for 30 sec, annealing at 50 ± 10°C for 60 sec, and extension at
72°C for 90 sec, and a final 72°C incubation for 10 min. An approximately 500-bp PCR
product was obtained and cloned into the vector pCR-Blunt II-TOPO (Invitrogen Corp.,
Carlsbad, CA).

Independent clones were sequenced using the universal M13 forward and reverse
primers. DNA sequencing was carried out at the AGAC, University of Minnesota,
St. Paul, MN. DNA sequence analysis of this PCR product indicated similarity to the gps
genes from S. solfataricus and S. acidocaldarius. The Universal Genome Walker kit
(Clontech) was used to obtain more of the gps gene sequence flanking the original PCR
product from S. shibatae. Primers were synthesized based on the partial sequence and
used for genome walking experiments.

The following strategy was used to completely sequence the S. shibatae gps gene.
The ERWCRTS homolog was observed upstream of the S. solfataricus gps gene. The
UDP-N-acetylglucosamine-dolichyl-phosphate-N-acetylglucosamine
phosphotransferase gene was present downstream of the gps gene in both S. solfataricus
and S. acidocaldarius. Primers were designed based on the sequences of these two genes
flanking the gps gene: SsDolidn (5'-ACAGCGTTGGACACTCAG-3', SEQ ID NO:42) and
SsERCRTup (5'-GCOTCGATAATGGAAOTGAG-3', SEQ ID NO:43). An approximately
2 kb PCR product was amplified using the SsDolidn and SsERCRTup primers and
genomic DNA from S. shibatae. This PCR product was cloned into the vector pCR-Blunt
II-TOPO as described above and sequenced using the universal M13 forward and reverse
primers. The nucleotide sequence of the gps gene from S. shibatae is presented in SEQ
ID NO:44, and the amino acid sequence of the protein encoded by the gps gene is
presented in SEQ ID NO:45.

OTHER EMBODIMENTS 

It is to be understood that while the invention has been described in conjunction 
with the detailed description thereof, the foregoing description is intended to illustrate and 
not limit the scope of the invention, which is defined by the scope of the appended claims. 
Other aspects, advantages, and modifications are within the scope of the following 
claims. 
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WHAT IS CLAIMED IS: 

1 An isolated nucleic acid having at least 76% sequence identity to the nucleotide 
' sequence of SEQ ID NO:l or to a fragment of SEQ ID NO:l at least 33 contiguous 
nucleotides in length. 

5 2. The isolated nucleic add of claim 1. said nucleic acid having a, ,eas, 80% sequence 
identity to the nucleotide sequence of SEQ ID NO . 1 . 

3 . The isolated nucleic acid of claim 1 , said nucleic actd havtng a, least 85% sequence 
1 o identity to the nucleotide sequence of SEQ ID NO: 1 . 

4. The isolated nucleic acid of claint 1 , said nucletc acid having at .east 90% sequence 
identity to the nucleotide sequence of SEQ ID NO: 1 

,5 5. The Mated nucleic acid of claint 1, said nucleic actd havtng a. leas. 95% sequence 
identity to the nucleotide sequence of SEQ ID NO: 1 . 

6. An expression vector comprising the nucleic acid of claim 1 operably linked ,o an 
expression control element. 

7. An isolated nucleic acid encoding a zeaxanthin glucosyl transferase polypeptide at 
least 75% identical to the amino acid sequence of SEQ ID NO:2. 

8 An isolated nucleic acid having at least 78% sequence identity to the nucleotide 
25 ' sequence of SEQ ID NO:3 or to a fragment of SEQ ID NO:3 at least 32 contiguous 

nucleotides in length. 

9. The isolated nucleic acid of claim 8, said nucleic acid having at least 80% sequence 
identity to the nucleotide sequence of SEQ ID NO:3. 

30 



20 
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10. The isolated nucleic acid of claim 8, said nucleic acid having at least 85% sequence 
identity to the nucleotide sequence of SEQ ID NO:3. 

11. The isolated nucleic acid of claim 8, said nucleic acid having at least 90% sequence 
5 identity to the nucleotide sequence of SEQ ID NO:3. 

12. The isolated nucleic acid of claim 8 5 said nucleic acid having at least 95% sequence 
identity to the nucleotide sequence of SEQ ID NO:3. 

13. An expression vector comprising the nucleic acid of claim 8 operably linked to an 
expression control element. 

14. An isolated nucleic acid encoding a lycopene (3-cyclase polypeptide at least 83% 
identical to the amino acid sequence of SEQ ID NO:4. 

15. An isolated nucleic acid having at least 81% sequence identity to the nucleotide 
sequence of SEQ ID NO:5 or to a fragment of SEQ ID NO:5 at least 60 contiguous 
nucleotides in length. * 

20 16. The isolated nucleic acid of claim 15, said nucleic acid having at least 85% sequence 
identity to the nucleotide sequence of SEQ ID NO:5. 

The isolated nucleic acid of claim 15 3 said nucleic acid having at least 90% sequence 
identity to the nucleotide sequence of SEQ ID NO:5. 

The isolated nucleic acid of claim 15, said nucleic acid having at least 95% sequence 
identity to the nucleotide sequence of SEQ ID NO:5. 

19. An expression vector comprising the nucleic acid of claim 15 operably linked to an 
30 expression control element. 



10 
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25 



18. 
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20 An isolated nucleic acid encoding a geranylgeranyl pyrophosphate synthase 
polypepnde a, least 85% identical to the amino acid sequence of SEQ ID NO:6. 

2! An tsolated nucleic acid having a. least 82% sequence identity to the nucleotide 
sequence of SEQ ID NO:7 or to a fragment of SEQ ID NO:7 at least 30 contiguous 
nucleotides in length. 

22. The isolated nucleic acid of claim 21, said nucleic acid having at leas, 85% sequence 
identity to the nucleotide sequence of SEQ ID NO:7. 

23. The tsolared nucleic acid of claim 2, , said nucleic acid having at leas. 90% sequence 
identity to the nucleotide sequence of SEQ ID NO:7. 

24. The isolated nucleic acid of claim 21, said nucleic acid having at.eas. 95% sequence 
identity to the nucleotide sequence of SEQ ID NO:7. 

25. An expression vector comprising the nucleic acid of claim 2, operably linked to an 
expression control element. 

26. An isolated nucleic acid encoding a phytoene desaturase polypeptide a. leas. 90% 
identical to the amino acid sequence of SEQ ID NO:8. 



27. An isolated nucleic acid having at least 82% sequence identity to the nucleotide
sequence of SEQ ID NO:9 or to a fragment of SEQ ID NO:9 at least 23 contiguous
nucleotides in length.

28. The isolated nucleic acid of claim 27, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

29. The isolated nucleic acid of claim 27, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

30. The isolated nucleic acid of claim 27, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:9.

31. An expression vector comprising the nucleic acid of claim 27 operably linked to an
expression control element.

32. An isolated nucleic acid encoding a phytoene synthase polypeptide at least 89% 
identical to the amino acid sequence of SEQ ID NO:10.

33. An isolated nucleic acid having at least 85% sequence identity to the nucleotide
sequence of SEQ ID NO:11 or to a fragment of SEQ ID NO:11 at least 36 contiguous
nucleotides in length.

34. The isolated nucleic acid of claim 33, said nucleic acid having at least 85% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

35. The isolated nucleic acid of claim 33, said nucleic acid having at least 90% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

36. The isolated nucleic acid of claim 33, said nucleic acid having at least 95% sequence
identity to the nucleotide sequence of SEQ ID NO:11.

37. An expression vector comprising the nucleic acid of claim 33 operably linked to an
expression control element.

38. An isolated nucleic acid encoding a β-carotene hydroxylase polypeptide at least 90%
identical to the amino acid sequence of SEQ ID NO:12.

39. Membranous bacteria comprising at least one exogenous nucleic acid encoding
phytoene desaturase, lycopene β-cyclase, β-carotene hydroxylase, and β-carotene C4
oxygenase, wherein expression of said at least one exogenous nucleic acid produces
detectable amounts of astaxanthin in said membranous bacteria.

40. The membranous bacteria of claim 39, wherein the amino acid sequence of said
phytoene desaturase is at least 90% identical to the amino acid sequence of SEQ ID
NO:8.

41. The membranous bacteria of claim 39, wherein the amino acid sequence of said
lycopene β-cyclase is at least 83% identical to the amino acid sequence of SEQ ID
NO:4.

42. The membranous bacteria of claim 39, wherein the amino acid sequence of said
β-carotene hydroxylase is at least 90% identical to the amino acid sequence of SEQ
ID NO:12.

43. The membranous bacteria of claim 39, wherein said membranous bacteria further
comprises an exogenous nucleic acid encoding geranylgeranyl pyrophosphate
synthase.

44. The membranous bacteria of claim 39, wherein said membranous bacteria lacks
endogenous bacteriochlorophyll biosynthesis.

45. The membranous bacteria of claim 43, wherein said exogenous nucleic acid encodes a
multifunctional geranylgeranyl pyrophosphate synthase.

46. The membranous bacteria of claim 45, wherein the amino acid sequence of said
multifunctional geranylgeranyl pyrophosphate synthase is at least 90% identical to the
amino acid sequence of SEQ ID NO:45.



WO 02/079395 PCT/US02/02124

47. The membranous bacteria of claim 39, wherein the amino acid sequence of said
β-carotene C4 oxygenase is at least 80% identical to the amino acid sequence of SEQ
ID NO:39.

48. The membranous bacteria of claim 39, wherein said membranous bacteria further
comprise an exogenous nucleic acid encoding phytoene synthase.

49. The membranous bacteria of claim 48, wherein the amino acid sequence of said
phytoene synthase is at least 89% identical to the amino acid sequence of SEQ ID
NO:10.

50. The membranous bacteria of claim 39, wherein said membranous bacteria are a 
Rhodobacter species. 

51. Membranous bacteria, said membranous bacteria comprising an exogenous nucleic
acid encoding a phytoene desaturase having an amino acid sequence at least 90%
identical to the amino acid sequence of SEQ ID NO:8, and wherein said membranous
bacteria produces detectable amounts of lycopene.

52. The membranous bacteria of claim 51, wherein said membranous bacteria further
comprise a lycopene β-cyclase, and wherein said membranous bacteria produce
detectable amounts of β-carotene.

53. The membranous bacteria of claim 52, wherein said membranous bacteria further
comprise a β-carotene hydroxylase, and wherein said membranous bacteria produce
detectable amounts of zeaxanthin.

54. Membranous bacteria comprising at least one exogenous nucleic acid encoding
phytoene desaturase, lycopene β-cyclase, and β-carotene C4 oxygenase, wherein
expression of said at least one exogenous nucleic acid produces detectable amounts of
canthaxanthin in said membranous bacteria.

55. A composition comprising an engineered Rhodobacter cell, wherein said cell
produces a detectable amount of astaxanthin or canthaxanthin.

56. The composition of claim 55, wherein said engineered Rhodobacter cell comprises at
least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase,
β-carotene hydroxylase, and β-carotene C4 oxygenase.



57. The composition of claim 55, wherein said composition is formulated for aquaculture.

58. The composition of claim 57, wherein said composition pigments the flesh of fish or
the carapace of crustaceans after ingestion.

59. The composition of claim 55, wherein said composition is formulated for human
consumption.

60. The composition of claim 55, wherein said composition is formulated as an animal
feed.

61. The composition of claim 60, wherein said animal feed is formulated for consumption
by chickens, turkeys, cattle, swine, or sheep.

62. A method of making a nutraceutical, said method comprising extracting carotenoids
from an engineered Rhodobacter cell, said engineered Rhodobacter cell comprising at
least one exogenous nucleic acid encoding phytoene desaturase, lycopene β-cyclase,
β-carotene hydroxylase, and β-carotene C4 oxygenase, and wherein said Rhodobacter
cell produces detectable amounts of astaxanthin.

63. Membranous bacteria, said membranous bacteria comprising an exogenous nucleic
acid encoding a lycopene β-cyclase having an amino acid sequence at least 83%
identical to the amino acid sequence of SEQ ID NO:4.

64. The membranous bacteria of claim 63, said membranous bacteria further comprising a
phytoene desaturase, wherein said membranous bacteria produces detectable amounts
of β-carotene.

65. The membranous bacteria of claim 64, said membranous bacteria further comprising a
β-carotene hydroxylase, wherein said bacteria produces detectable amounts of
zeaxanthin.

66. Membranous bacteria, said membranous bacteria comprising a β-carotene
hydroxylase having an amino acid sequence at least 90% identical to the amino acid
sequence of SEQ ID NO:12.

67. The membranous bacteria of claim 66, said membranous bacteria further comprising a
lycopene β-cyclase, and wherein said membranous bacteria produces detectable
amounts of zeaxanthin.

68. The membranous bacteria of claim 67, said membranous bacteria further comprising a
phytoene desaturase, wherein said membranous bacteria produces detectable amounts
of β-carotene.

69. Membranous bacteria, said bacteria lacking an endogenous nucleic acid encoding a
farnesyl pyrophosphate synthase, and wherein said bacteria produce detectable
amounts of carotenoids.

70. The membranous bacteria of claim 69, wherein said bacteria comprise an exogenous
nucleic acid encoding a multifunctional geranylgeranyl pyrophosphate synthase.

71. The membranous bacteria of claim 70, wherein the amino acid sequence of said
multifunctional geranylgeranyl pyrophosphate synthase is at least 90% identical to the
amino acid sequence of SEQ ID NO:45.

72. The membranous bacteria of claim 69, wherein said membranous bacteria are a
species of Rhodobacter.

73. An isolated nucleic acid having at least 60% sequence identity to the nucleotide
sequences of SEQ ID NO:38, or to a fragment of the nucleic acid of SEQ ID NO:38 at
least 15 contiguous nucleotides in length.

74. The isolated nucleic acid of claim 73, said nucleic acid having at least [...]% sequence
identity to the nucleotide sequences of SEQ ID NO:38, or to a fragment of the nucleic
acid of SEQ ID NO:38 at least 15 contiguous nucleotides in length.

75. The isolated nucleic acid of claim 73, said nucleic acid having at least 90% sequence
identity to the nucleotide sequences of SEQ ID NO:38, or to a fragment of the nucleic
acid of SEQ ID NO:38 at least 15 contiguous nucleotides in length.

76. The isolated nucleic acid of claim 73, wherein said nucleic acid encodes a β-carotene
C4 oxygenase.

77. Membranous bacteria comprising an exogenous nucleic acid encoding a β-carotene
C4 oxygenase, said β-carotene oxygenase having an amino acid sequence at least 80%
identical to the amino acid sequence of SEQ ID NO:39.

78. A host cell comprising an exogenous nucleic acid, wherein the exogenous nucleic acid
comprises a nucleic acid sequence encoding one or more polypeptides that catalyze
the formation of (3S, 3'S) astaxanthin, wherein the host cell produces CoQ-10 and
(3S, 3'S) astaxanthin.

79. A method of making CoQ-10 and (3S, 3'S) astaxanthin at substantially the same time,
the method comprising transforming a host cell with a nucleic acid, wherein the
nucleic acid comprises a nucleic acid sequence that encodes one or more
polypeptides, wherein the polypeptides catalyze the formation of (3S, 3'S)
astaxanthin; and culturing the host cell under conditions that allow for the production
of (3S, 3'S) astaxanthin and CoQ-10.

80. The method of claim 79, additionally comprising transforming the host cell with at
least one exogenous nucleic acid, the exogenous nucleic acid encoding one or more
polypeptides, wherein the polypeptides catalyze the formation of CoQ-10.

81. An isolated nucleic acid having a nucleotide sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID
NO:9, SEQ ID NO:11, SEQ ID NO:38, and SEQ ID NO:44.

82. An isolated nucleic acid having at least 90% sequence identity to the nucleotide
sequences of SEQ ID NO:44, or to a fragment of the nucleic acid of SEQ ID NO:44 at
least 60 contiguous nucleotides in length.

83. A method of making geranylgeranyl pyrophosphate, said method comprising 
contacting isopentenyl pyrophosphate and dimethylallyl pyrophosphate with a 
polypeptide encoded by the isolated nucleic acid of claim 82. 

84. A method of making geranylgeranyl pyrophosphate, said method comprising
contacting farnesyl pyrophosphate and isopentenyl pyrophosphate with a polypeptide
encoded by the isolated nucleic acid of claim 15 or the polypeptide of claim 20.

85. A method of making β-carotene, said method comprising contacting lycopene with a
polypeptide encoded by the isolated nucleic acid of claim 8 or the polypeptide of
claim 14.

86. A method of making lycopene, said method comprising contacting phytoene with a
polypeptide encoded by the isolated nucleic acid of claim 21 or the polypeptide of
claim 26.

87. A method of making phytoene, said method comprising contacting geranylgeranyl
pyrophosphate with a polypeptide encoded by the isolated nucleic acid of claim 27 or
the polypeptide of claim 32.

88. A method of making zeaxanthin, said method comprising contacting β-carotene with a
polypeptide encoded by the isolated nucleic acid of claim 33 or the polypeptide of
claim 38.

89. A method of making canthaxanthin, said method comprising contacting β-carotene
with a polypeptide encoded by the isolated nucleic acid of claim 73 or a polypeptide
having an amino acid sequence at least 80% identical to the amino acid sequence of
SEQ ID NO:39.

90. A method of making astaxanthin, said method comprising contacting canthaxanthin
with a polypeptide encoded by the isolated nucleic acid sequence of claim 33 or the
polypeptide of claim 38.

91. A method of making astaxanthin, said method comprising contacting zeaxanthin with
a polypeptide encoded by the isolated nucleic acid sequence of claim 73 or a
polypeptide having an amino acid sequence at least 80% identical to the amino acid
sequence of SEQ ID NO:39.
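
Method claims 83 to 91 each recite one conversion (substrate, polypeptide, product), so read together they describe a small reaction network. The Python sketch below is only an illustration of that reading: it records the substrate and product named in each claim and enumerates the possible routes between two compounds. The names `REACTIONS` and `routes` are hypothetical helpers introduced here, not terms from the claims, and the sketch says nothing about yields or reaction conditions.

```python
from collections import defaultdict

# (substrates, product, claim number), as recited in method claims 83-91.
REACTIONS = [
    (("isopentenyl pyrophosphate", "dimethylallyl pyrophosphate"),
     "geranylgeranyl pyrophosphate", 83),
    (("farnesyl pyrophosphate", "isopentenyl pyrophosphate"),
     "geranylgeranyl pyrophosphate", 84),
    (("lycopene",), "beta-carotene", 85),
    (("phytoene",), "lycopene", 86),
    (("geranylgeranyl pyrophosphate",), "phytoene", 87),
    (("beta-carotene",), "zeaxanthin", 88),
    (("beta-carotene",), "canthaxanthin", 89),
    (("canthaxanthin",), "astaxanthin", 90),
    (("zeaxanthin",), "astaxanthin", 91),
]


def routes(start, goal):
    """Return every chain of compounds leading from start to goal."""
    graph = defaultdict(list)
    for substrates, product, _claim in REACTIONS:
        for substrate in substrates:
            graph[substrate].append(product)

    found = []

    def walk(node, path):
        # Depth-first search; a compound is never visited twice on one path.
        if node == goal:
            found.append(path)
            return
        for nxt in graph[node]:
            if nxt not in path:
                walk(nxt, path + [nxt])

    walk(start, [start])
    return found


if __name__ == "__main__":
    for route in routes("phytoene", "astaxanthin"):
        print(" -> ".join(route))
```

Run as a script, this lists two routes from phytoene to astaxanthin, one through zeaxanthin (claims 88 and 91) and one through canthaxanthin (claims 89 and 90).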
SEQUENCE LISTING

<110> Cargill, Incorporated 
<120> Carotenoid Biosynthesis 

<130> 12794-004WO1 

<150> US 60/288,984 
<151> 2001-05-04 

<150> US 60/264,329
<151> 2001-01-26 

<160> 47 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1296 
<212> DNA 

<213> Pantoea stewartii 



<400> 1 tt-rttcaacc atgttcgcgc tctgcaaaac 

atgagccatt ttgcggtgat cgcaccgccc "tttcagcc g g ^ acatgactgc 
cttgctcagg aattagtggc -gcggtcat cgtgttacgt ^ aacgcatcct 

aaagcgctgg taacgggcag ^^atcgga J cactcggacc ctcgatgtta 

cccggttcct tatcgcacct gctgcacctg g g J gccgggaact gcccgccgct 

cgactgatca atgaaatggc acgtacc gc ^atgcttt J J99 aggt gcagta 
tttcatgcgt tgcagataga «9cgtg«tc J«gatca«a gg g gctcaaccgc 
gtcgcagaag cgtcaggtct ^cgtttgtt tcggtggc JJJ tgcggc tcgg 
gaaccgggtt tgcctctggc 9^gatgcct ttcg g g a cgat cgtgtg 

gaacgctata ccaccagcga aaaaatttat gactgg y a ^ aaactgca tcattgtttt 
atcgcgcatc atgcatgcag "tgggttta 9«ccgcgtg ccccg caaagcgctg 

tctccactgg cacaaatcag ^agttgatc aggggacgcc ggggtcatca 

ccagactgct ttcatgcggt tggaccgtta =99 cctcgc tggg caccctgcag 

acttcttatt ttccgtcccc ggacaaaccc cgta « gcgaa gaggt ggatgcgcag 

ggacatcgtt atggcctgtt caggaccatc ^caaagcct g g g ggcccggggc 
??actgttgg cacactgtgg cggcctctca ^acgcagg cactttcaca ggcacagttg 
ggggacattc aggttgtgga ttttgccgat caatccgc g cacaccgcta 
acaatcacac atggtgggat gaatacgg tt atcatggc 

ftSSSS SS=S2 Sijjcjc 

SSSS: 55SS g ? «gg cgatUac ctgtcagcca £0 
gtactcagtg ggcaggatta tgcaaccgca ctatga 



<210> 2 
<211> 431 
<212> PRT 

<213> Pantoea stewartii 



Ala Leu Gln Asn Leu Ala Gln Glu Leu Val Ala Arg Gly His Arg Val
20 25 30

Thr Phe Phe Gln Gln His Asp Cys Lys Ala Leu Val Thr Gly Ser Asp
35 40 45

Ile Gly Phe Gln Thr Val Gly Leu Gln Thr His Pro Pro Gly Ser Leu
50 55 60

Ser His Leu Leu His Leu Ala Ala His Pro Leu Gly Pro Ser Met Leu
65 70 75 80

Arg Leu Ile Asn Glu Met Ala Arg Thr Ser Asp Met Leu Cys Arg Glu
85 90 95

Leu Pro Ala Ala Phe His Ala Leu Gln Ile Glu Gly Val Ile Val Asp
100 105 110

Gln Met Glu Pro Ala Gly Ala Val Val Ala Glu Ala Ser Gly Leu Pro
115 120 125

Phe Val Ser Val Ala Cys Ala Leu Pro Leu Asn Arg Glu Pro Gly Leu
130 135 140

Pro Leu Ala Val Met Pro Phe Glu Tyr Gly Thr Ser Asp Ala Ala Arg
145 150 155 160

Glu Arg Tyr Thr Thr Ser Glu Lys Ile Tyr Asp Trp Leu Met Arg Arg
165 170 175

His Asp Arg Val Ile Ala His His Ala Cys Arg Met Gly Leu Ala Pro
180 185 190

Arg Glu Lys Leu His His Cys Phe Ser Pro Leu Ala Gln Ile Ser Gln
195 200 205

Leu Ile Pro Glu Leu Asp Phe Pro Arg Lys Ala Leu Pro Asp Cys Phe
210 215 220

His Ala Val Gly Pro Leu Arg Gln Pro Gln Gly Thr Pro Gly Ser Ser
225 230 235 240

Thr Ser Tyr Phe Pro Ser Pro Asp Lys Pro Arg Ile Phe Ala Ser Leu
245 250 255

Gly Thr Leu Gln Gly His Arg Tyr Gly Leu Phe Arg Thr Ile Ala Lys
260 265 270

Ala Cys Glu Glu Val Asp Ala Gln Leu Leu Leu Ala His Cys Gly Gly
275 280 285

Leu Ser Ala Thr Gln Ala Gly Glu Leu Ala Arg Gly Gly Asp Ile Gln
290 295 300

Val Val Asp Phe Ala Asp Gln Ser Ala Ala Leu Ser Gln Ala Gln Leu
305 310 315 320

Thr Ile Thr His Gly Gly Met Asn Thr Val Leu Asp Ala Ile Ala Ser
325 330 335

Arg Thr Pro Leu Leu Ala Leu Pro Leu Ala Phe Asp Gln Pro Gly Val
340 345 350

Ala Ser Arg Ile Val Tyr His Gly Ile Gly Lys Arg Ala Ser Arg Phe
355 360 365

Thr Thr Ser His Ala Leu Ala Arg Gln Ile Arg Ser Leu Leu Thr Asn
370 375 380

Thr Asp Tyr Pro Gln Arg Met Thr Lys Ile Gln Ala Ala Leu Arg Leu
385 390 395 400

Ala Gly Gly Thr Pro Ala Ala Ala Asp Ile Val Glu Gln Ala Met Arg
405 410 415

Thr Cys Gln Pro Val Leu Ser Gly Gln Asp Tyr Ala Thr Ala Leu
420 425 430
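
The <211> lengths in this listing allow a quick consistency check between a DNA entry and the protein entry that follows it. The sketch below assumes, and this is an assumption rather than something the listing states, that each protein entry is the translation of the preceding DNA entry and that the DNA entry ends with a single stop codon. Under that assumption the DNA length should equal three times (protein length + 1). The lengths are copied from the <211> lines of SEQ ID NO:1 to NO:10; `PAIRS`, `leftover_nucleotides` and `check_pairs` are illustrative names.

```python
# (DNA SEQ ID NO, nucleotides, protein SEQ ID NO, residues), from the <211> lines.
PAIRS = [
    (1, 1296, 2, 431),
    (3, 1149, 4, 382),
    (5, 912, 6, 303),
    (7, 1479, 8, 492),
    (9, 893, 10, 296),
]


def leftover_nucleotides(dna_len, protein_len):
    """Nucleotides not accounted for by the protein's codons plus one stop codon."""
    return dna_len - 3 * (protein_len + 1)


def check_pairs(pairs=PAIRS):
    """Map (DNA id, protein id) to the number of leftover nucleotides."""
    return {
        (dna_id, prot_id): leftover_nucleotides(nt, aa)
        for dna_id, nt, prot_id, aa in pairs
    }


if __name__ == "__main__":
    for (dna_id, prot_id), extra in check_pairs().items():
        print(f"SEQ ID NO:{dna_id} / SEQ ID NO:{prot_id}: {extra} leftover nucleotide(s)")
```

A result of zero means the two lengths agree under the stated assumption. A nonzero result, as for SEQ ID NO:9, only flags nucleotides outside the reading frame; that entry begins with `cc` ahead of its first `atg`.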

<210> 3 
<211> 1149 
<212> DNA 

<213> Pantoea stewartii 
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3 



ssslu .««,«« r «« ? f t < iiiiziiit imziii ill™ »j 
™-r- =t c-sif 

tggatagcgc cgcttgtggt «atcactgg cc g tttcgc cgggatactc 300 
cgtcgccatg tgaacagtgg ctactactgc tttcagccgt tcatgctgaa 

cggcaacagt ttggacaaca "tatggctg ^accgcgg \^ cggacgggg t 

tcggtccagt tagcggatgg ccggattatt "tgccag y y gga gtggcaa 

tacacgcctg attctgcact acgcgtagga ttcc.ggcat ^ cgatcagcaa 

ctgagcgcgc cgcatggttt atcgtcaccg ^tatcatgg * * gatcgaagac 

aa?gctacc gctttgttta taccctgccg '"tccgcaa ccgc g cattcgcgat 660 

acacactaca ttgacaaggc taatcttcag ^cgaacggg g g 9^ gggtgcattg 720 

tatgctgcgc gacagggttg J«gtt«c.g ££^mc agcaaccgca agcctgtagc 780 

cccattacgt taacgggcga taatcgtcag "ttgg c ^ ctaccgct cgcggtggcg 

ggattacgcg ccgggctgtt tcatccgaca a cggctact a ccagaC gatt 

Sggccgatc gtctcagcgc jctgg.tgtg tttacctctt J gaatcgcatg 

gctcactttg cccagcaacg ttggcag aa cagggg ctatggctta 

ttgtttttag ccgg.ccggc cg.gtc.egc tggcgtg 9 ^ ^ tcggctaC gc 

£Sc£3 ?-gcggca t tgcaggcaat tatgacgact 1140 



catcgttga 

<210> 4 
<211> 382 
<212> PRT 

<213> Pantoea stewartii 



<400> 4 t no T^n Val Glv Ala Gly Leu Ala Asn 

Met Gin Pro His Tyr Asp Leu lie Leu Val Gly ^ 

1 5 r^r, nn His Pro Asp Met Arg He 

Gly Leu He Ala Leu Arg Leu Gin Gin Gin His P ^ 

20 , ^ Prn riu Ala Gly Gly Asn His Thr Trp Ser 
Leu Leu He Glu Ala Gly Pro Glu Ala biy y ^ 

Phe » i5 s. -i. - ~ - «■ Gi - His r tip ue Ala pro 

L9U f al His His Trp L MP * «n ™ « 

65 1° n« Tvr Tvr Cvs Val Thr Ser Arg His Phe 

Arg Arg His Val Asn Ser Gly Tyr Tyr cys ^ 

V rm nn Phe Gly Gin His Leu Trp Leu His Thr 
Ala Gly He Leu Arg Gin Gin Phe bJ.y v» ^ 

R1 , Val s« S VI His S J - ».l «1» Leu £. «P Oly «, 

Ile £. ». M. Thr V,X II. MP «■ «y Tyr THr Pro Asp 

130 , nn ala Phe He Gly Gin Glu Trp Gin 

Ser Ala Leu Arg Val Gly Phe Gin Ala Phe lie by ^ 

145 150 „ c „ P7 . n Ti e He Met Asp Ala Thr 

Leu Ser Ala Pro His Gly Leu Ser Ser Pro He 175 

n . »t_ ] rp..^ Thr Leu Pro Leu Ser 
Val Asp Gin Gin Asn Gly Tyr Arg Phe Val Tyr Thr 

180 mw u ; e Tur Tl<= Asd Lys Ala Asn 

Ala Thr Ala Leu Leu He Glu Asp Thr Hrs Tyr He Asp 

195 , en tip Ara Asp Tyr Ala Ala Arg 

Leu Gin Ala Glu Arg. Ala Arg Gin Asn He Arg Asp iy 

210 215 



Gln Gly Trp Pro Leu Gln Thr Leu Leu Arg Glu Glu Gln Gly Ala Leu
225 230 235 240

Pro Ile Thr Leu Thr Gly Asp Asn Arg Gln Phe Trp Gln Gln Gln Pro
245 250 255

Gln Ala Cys Ser Gly Leu Arg Ala Gly Leu Phe His Pro Thr Thr Gly
260 265 270

Tyr Ser Leu Pro Leu Ala Val Ala Leu Ala Asp Arg Leu Ser Ala Leu
275 280 285

Asp Val Phe Thr Ser Ser Ser Val His Gln Thr Ile Ala His Phe Ala
290 295 300

Gln Gln Arg Trp Gln Gln Gln Gly Phe Phe Arg Met Leu Asn Arg Met
305 310 315 320

Leu Phe Leu Ala Gly Pro Ala Glu Ser Arg Trp Arg Val [illegible] Gln Arg
325 330 335

Phe Tyr Gly Leu Pro Glu Asp Leu Ile Ala Arg Phe Tyr Ala Gly Lys
340 345 350

Leu Thr Val Thr Asp Arg Leu Arg Ile Leu Ser Gly Lys Pro Pro Val
355 360 365

Pro Val Phe Ala Ala Leu Gln Ala Ile Met Thr Thr His Arg
370 375 380







<210> 5 
<211> 912 
<212> DNA 

<213> Pantoea stewartii 
<400> 5
atgatggtct gcgcaaaaaa acacgttcac cttactggca tttcggctga gcagttgctg 60
gctgatatcg atagccgcct tgatcagtta ctgccggttc agggtgagcg ggattgtgtg 120
ggtgccgcga tgcgtgaagg cacgctggca ccgggcaaac gtattcgtcc gatgctgctg 180
ttattaacag cgcgcgatct tggctgtgcg atcagtcacg ggggattact ggatttagcc 240
tgcgcggttg aaatggtgca tgctgcctcg ctgattctgg atgatatgcc ctgcatggac 300
gatgcgcaga tgcgtcgggg gcgtcccacc attcacacgc agtacggtga acatgtggcg 360
attctggcgg cggtcgcttt actcagcaaa gcgtttgggg tgattgccga ggctgaaggt 420
ctgacgccga tagccaaaac tcgcgcggtg tcggagctgt ccactgcgat tggcatgcag 480
ggtctggttc agggccagtt taaggacctc tcggaaggcg ataaaccccg cagcgccgat 540
gccatactgc taaccaatca gtttaaaacc agcacgctgt tttgcgcgtc aacgcaaatg 600
gcgtccattg cggccaacgc gtcctgcgaa gcgcgtgaga acctgcatcg tttctcgctc 660
gatctcggcc aggcctttca gttgcttgac gatcttaccg atggcatgac cgataccggc 720
aaagacatca atcaggatgc aggtaaatca acgctggtca atttattagg ctcaggcgcg 780
gtcgaagaac gcctgcgaca gcatttgcgc ctggccagtg aacacctttc cgcggcatgc 840
caaaacggcc attccaccac ccaacttttt attcaggcct ggtttgacaa aaaactcgct 900
gccgtcagtt aa 912

<210> 6
<211> 303
<212> PRT
<213> Pantoea stewartii

<400> 6
Met Met Val Cys Ala Lys Lys His Val His Leu Thr Gly Ile Ser Ala
1 5 10 15

Glu Gln Leu Leu Ala Asp Ile Asp Ser Arg Leu Asp Gln Leu Leu Pro
20 25 30

Val Gln Gly Glu Arg Asp Cys Val Gly Ala Ala Met Arg Glu Gly Thr
35 40 45

Leu Ala Pro Gly Lys Arg Ile Arg Pro Met Leu Leu Leu Leu Thr Ala
50 55 60

Arg Asp Leu Gly Cys Ala Ile Ser His Gly Gly Leu Leu Asp Leu Ala
65 70 75 80

Cys Ala Val Glu Met Val His Ala Ala Ser Leu Ile Leu Asp Asp Met
85 90 95

Pro Cys Met Asp Asp Ala Gln Met Arg Arg Gly Arg Pro Thr Ile His
100 105 110

Thr Gln Tyr Gly Glu His Val Ala Ile Leu Ala Ala Val Ala Leu Leu
115 120 125

Ser Lys Ala Phe Gly Val Ile Ala Glu Ala Glu Gly Leu Thr Pro Ile
130 135 140

Ala Lys Thr Arg Ala Val Ser Glu Leu Ser Thr Ala Ile Gly Met Gln
145 150 155 160

Gly Leu Val Gln Gly Gln Phe Lys Asp Leu Ser Glu Gly Asp Lys Pro
165 170 175

Arg Ser Ala Asp Ala Ile Leu Leu Thr Asn Gln Phe Lys Thr Ser Thr
180 185 190

Leu Phe Cys Ala Ser Thr Gln Met Ala Ser Ile Ala Ala Asn Ala Ser
195 200 205

Cys Glu Ala Arg Glu Asn Leu His Arg Phe Ser Leu Asp Leu Gly Gln
210 215 220

Ala Phe Gln Leu Leu Asp Asp Leu Thr Asp Gly Met Thr Asp Thr Gly
225 230 235 240

Lys Asp Ile Asn Gln Asp Ala Gly Lys Ser Thr Leu Val Asn Leu Leu
245 250 255

Gly Ser Gly Ala Val Glu Glu Arg Leu Arg Gln His Leu Arg Leu Ala
260 265 270

Ser Glu His Leu Ser Ala Ala Cys Gln Asn Gly His Ser Thr Thr Gln
275 280 285

Leu Phe Ile Gln Ala Trp Phe Asp Lys Lys Leu Ala Ala Val Ser
290 295 300
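
The relationship between a DNA entry and the protein entry after it can be illustrated by translating a short stretch of sequence. The sketch below uses the standard genetic code (an assumption; the listing does not state which code table applies) and the first 60 nucleotides of SEQ ID NO:5 as given above. `CODON_TABLE` and `translate` are illustrative names, and the output uses one-letter residue codes rather than the three-letter codes of the listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g (third base fastest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    seq = "".join(dna.split()).lower()  # drop the spaces between 10-mer blocks
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row (nucleotides 1-60) of SEQ ID NO:5.
SEQ5_ROW1 = "atgatggtct gcgcaaaaaa acacgttcac cttactggca tttcggctga gcagttgctg"

if __name__ == "__main__":
    print(translate(SEQ5_ROW1))
```

The 20 residues obtained this way (MMVCAKKHVHLTGISAEQLL) correspond to the first 20 residues of SEQ ID NO:6, Met Met Val Cys Ala Lys Lys His Val His Leu Thr Gly Ile Ser Ala Glu Gln Leu Leu.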



290 

<210> 7 
<211> 1479 
<212> DNA 

<213> Pantoea stewartii 



.fgaaaccaa ctacggtaat tggtgcgggc tttgjtjjcc tjjcjctjjc ..ttcjttt. 60 
caggccgcag gtattcctgt "tgctgctt ^gcagcgcg ac^g caccg 1B0 

tatgtttatc aggagcaggg ctttactttt gatgc gy tta cg tcgagctg 240 

agcgcgattg aagaactgtt tgctctggcc ^taaacagc caa ttacgat 300 

t?g 9 ccgg t ca cgccgtttta tegcctgtgc tgggagtccg ^gg tgttgcgggt 3 60 

aacgaccagg cccagttaga agcgcagata agggctatct gaagctcggc 

fatigagcgt tccttgacta ttcgcgtgcc ^attcaatg ^ ggcaaagctg 
actgtgcctt ttttatcgtt -aagacatg c«cggg g t J , gcat cttcgg 

caggcatggc gcagcgttta «gtaaagtt g ^ cgt ttgcaac ctcgtccatt 

caggcgtttt cttttcactc g ctct "^ a ?aaqqcgtct ggtttccacg cggtggaacc 
tatacgctga ttcacgcgtt agaacgggaa Jggggcgtc 99 agtcgtgctt 
ggtgcgctgg tcaatggcat 9-tc.agctg "tcaggatc ^9 9^ gcagt a 
aacgcccggg tcagtcatat 99 a aaccgtt gggg « ct gatgttgt acatacctat 
gacggcagac ggtttgaaac ctgcgcggtg ^gtcgaa eg g y gcaatccaag 
9 cgcga t =tgc tgtctcagca tcccgcagcc jcta.gc.gg Scatcatca cgatcaactc 
cgtatgagta actcactgtt tgtactctat "tgg tgatt cacga aatttttaac 

gcccatcata ccgtctgttt tgggccacgc taccgtga c J ccttgtgt cacggatcC g 
catgatggtc tggctgagga tttttcgett cgcctgttcc acacttaggc 

tcactggcac eggaagggtg eggcagctat tatgrgc gg 



acggcgaacc tcgactgggc ggtagaagga ccccgactgc gcgatcgtat ttttgactac 1200
cttgagcaac attacatgcc tggcttgcga agccagttgg tgacgcaccg tatgtttacg 1260
ccgttcgatt tccgcgacga gctcaatgcc tggcaaggtt cggccttctc ggttgaacct 1320
attctgaccc agagcgcctg gttccgacca cataaccgcg ataagcacat tgataatctt 1380
tatctggttg gcgcaggcac ccatcctggc gcgggcattc ccggcgtaat cggctcggcg 1440
aaggcgacgg caggcttaat gctggaggac ctgatttga 1479












<210> 8 






























<211> 492 




























<212> PRT 




























<213> Pantoea stewartii 






















<4 00> 8 






























Mpt" T \7 c; 


L X \J 


Thr 


Thr 


Val 


He 


Gly 


Ala 


Glv 


Phe 


Gly 


Gly 


Leu 


Ala 


Leu 


1 






5 










10 










15 




Si a Tin 

Mxa lie 


Arg 


Leu 


Ulil 


a 1 a 


Ala 


Gly 


He 


Pro 


Val 


Leu 


Leu 


Leu 


Glu 


Gin 




? n 




















30 






7\ "r~ /-r 7\ c- r^i 

niy i-\sp 


Ly s 


Pro 


fZ 1 \7 




Ar g 


A 1 ^ 


i yx 


Val 


Tvr 
i y x 


Gin 


Glu 


Gin 


Gly 


Phe 




J D 










4 o 










45 








Thr Pho 

i nr rr ne 


Asp 


r\± d 


fly 

*j j. y 


IT X U 


1 I IX 


val 


I1C 


Thr 


Asp 


Pro 


Ser 


Ala 


He 


Glu 


SO 










D D 










60 










fZ 1 n Toil 




A 1 A 
/*i.x ct 


Leu 


Ala 


o x y 


Lys 




T on 


.by o 


Asp 


Tyr 


Val 


Glu 


Leu 


65 








7 0 










7 5 










80 


Leu Pro 


Val 


Thr 


Pro 


Phe 


Tyr 


Arg 


7 oil 


^yo 




Glu 


Ser 


Gly 


Lys 


Val 








ft s 










90 










95 




Phe Asn 


Tyr 


Asp 


As n 




Gin 


Ala 


Gin 


Leu 


Glu 


Ala 


Gin 


He 


Gin 


Gin 




t no 

X \J \J 










i n s 

1 \J vj 










110 






Phe Asn 


Pro 


Ar g 




val 


A "I ^ 


oiy 


i yr 


Arg 


Ala 


Phe 


Leu 


Asp 


Tyr 


Ser 




115 










X £. \J 










125 








Arg Ala 


Val 


Pho 

rr ne 


Asn 


X LI 


\j± y 


T \r y* 

i yr 


Toil 
lit- U 


x»y «-> 


T .on 

XjC U 


Gly 


Thr 


Val 


Pro 


Phe 


130 




















140 










Leu Ser 


Phe 


jjy s 


Asp 


l ¥ JC L. 


Leu 


Arg 


AT ^ 


Ala 


Pro 


Gin 


Leu 


Ala 


Lys 


Leu 


145 






X Z) \J 










X -> *J 










160 


Gin Ala 


Trp 


Arg 


Cor 


Val 


i yr 


Q o r* 
OCX 


T .\7 Q 

Xiy o 


Val 


Ala 


Gly 


Tyr 


He 


Glu 


Asp 








±\J ID 










170 










175 




Glu His 


Leu 


Arg 


n n 


AT ^ 

.rt.X d 


Pho 
rue 


C? q y- 

Ot: 1 


Phe 


His 


Ser 


Leu 


Leu 


Val 


Gly 


Gly 






JL O U 










185 










190 






Asn Pro 


Phe 


AJ_a 


i nr 


Qo -r- 

oer 


Qo r- 


Tl * 
X X. 


lyr 


1 ill 


Leu 


He 


His 


Ala 


Leu 


Glu 




195 










? n n 










205 








Arg Glu 


Trp 


Gly 


Val 


Trp 


it ne 


Pro 


Arg 


CZ 1 \7 

o xy 


uiy 


Thr 


Gly 


Ala 


Leu 


Val 


210 










1 w> 










220 










Asn Gly Met 


He 


Lys 


Leu 


ir ne 


fin 


Asp 


L S U 


oiy 


Gly 


Glu 


Val 


Val 


Leu 


225 








230 










/ jj 










240 


Asn Ala 


Arg 


Val 


Ser 


His 


Met 


Glu 


Thr 


Val 


Gly 


Asp 


Lys 


He 


Glri 


Ala 






245 










250 










255 




Val Gin 


Leu 


Glu 


Asp 


Gly 


Arg 


Arg 


Phe 


Glu 


Thr 


Cys 


Ala 


Val 


Ala 


Ser 






260 










265 










270 






Asn Ala 


Asp 


Val 


Val 


His 


Thr 


Tyr 


Arg 


Asp 


Leu 


Leu 


Ser 


Gin 


His 


Pro 




275 










280 










285 








Ala Ala 


Ala 


Lys 


Gin 


Ala 


Lys 


Lys 


Leu 


Gin 


Ser 


Lys 


Arg 


Met 


Ser 


Asn 


290 








295 










300 










Ser Leu 


Phe 


Val 


Leu 


Tyr 


Phe 


Gly 


Leu 


Asn 


His 


His 


His 


Asp 


Gin 


Leu 


305 








310 










315 










320 


Ala His 


His 


Thr 


Val 


Cys 


Phe 


Gly 


Pro 


Arg 


Tyr 


Arg 


Glu 


Leu 


He 


His 








325 








330 










335 




Glu lie 


Phe 


Asn 


His 


Asp 


Gly 


Leu 


Ala 


Glu 


Asp 


Phe 


Ser 


Leu 


Tyr 


Leu 






340 










345 










350 
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His 

Ser 

Asp 
385 
Leu 

Arg 

Gly 

Arg 

Ala 
465 
Lys 



Ala 

Tyr 
370 
Trp 

Glu 

Met 

Ser 

Pro 
450 
Gly 

Ala 



Pro Cys Val Thr Asp 
355 

Tyr Val 



Ala Val 

Gin His 

Phe Thr 
420 
Ala Phe 
435 

His Asn 
Thr His 
Thr Ala 



Leu Ala Pro 
375 

Glu Gly Pro 
390 

Tyr Met Pro 
405 

Pro Phe Asp 

Ser Val Glu 

Arg Asp Lys 
455 

Pro Gly Ala 
470 

Gly Leu Met 
485 



Pro Ser 
360 

Val Pro 

Arg Leu 

Gly Leu 

Phe Arg 
425 
Pro He 
440 

His He 



Leu Ala Pro 

His Leu Gly 
380 

Arg Asp Arg 
395 

Arg Ser Gin 
410 

Asp Glu Leu 
Leu Thr Gin 



Gly He 
Leu Glu 



Asp Asn Leu 
4 60 

Pro Gly Val 
475 

Asp Leu He 
4 90 



Glu Gly 
365 

Thr Ala 

He Phe 

Leu Val 

Asn Ala 
430 
Ser Ala 
445 

Tyr Leu 
He Gly 



Cys Gly 

Asn Leu 

Asp Tyr 
400 
Thr His 
415 

Trp Gin 

Trp Phe 

Val Gly 

Ser Ala 
480 



<210> 9 
<211> 893 
<212> DNA 

<213> Pantoea stewartii 



<400> 9
ccatggcggt tggctcgaaa agctttgcga ctgcatcgac gcttttcgac gccaaaaccc 60
gtcgcagcgt gctgatgctt tacgcatggt gccgccactg cgacgacgtc attgacgatc 120
aaacactggg ctttcatgcc gaccagccct cttcgcagat gcctgagcag cgcctgcagc 180
agcttgaaat gaaaacgcgt caggcctacg ccggttcgca aatgcacgag cccgcttttg 240
ccgcgtttca ggaggtcgcg atggcgcatg atatcgctcc cgcctacgcg ttcgaccatc 300
tggaaggttt tgccatggat gtgcgcgaaa cgcgctacct gacactggac gatacgctgc 360
gttattgcta tcacgtcgcc ggtgttgtgg gcctgatgat ggcgcaaatt atgggcgttc 420
gcgataacgc cacgctcgat cgcgcctgcg atctcgggct ggctttccag ttgaccaaca 480
ttgcgcgtga tattgtcgac gatgctcagg tgggccgctg ttatctgcct gaaagctggc 540
tggaagagga aggactgacg aaagcgaatt atgctgcgcc agaaaaccgg caggccttaa 600
gccgtatcgc cgggcgactg gtacgggaag cggaacccta ttacgtatca tcaatggccg 660
gtctggcaca attaccctta cgctcggcct gggccatcgc gacagcgaag caggtgtacc 720
gtaaaattgg cgtgaaagtt gaacaggccg gtaagcaggc ctgggatcat cgccagtcca 780
cgtccaccgc cgaaaaatta acgcttttgc tgacggcatc cggtcaggca gttacttccc 840
ggatgaagac gtatccaccc cgtcctgctc atctctggca gcgcccgatc tag 893

<210> 10
<211> 296
<212> PRT
<213> Pantoea stewartii

<400> 10
Met Ala Val Gly Ser Lys Ser Phe Ala Thr Ala Ser Thr Leu Phe Asp
1 5 10 15

Ala Lys Thr Arg Arg Ser Val Leu Met Leu Tyr Ala Trp Cys Arg His
20 25 30

Cys Asp Asp Val Ile Asp Asp Gln Thr Leu Gly Phe His Ala Asp Gln
35 40 45

Pro Ser Ser Gln Met Pro Glu Gln Arg Leu Gln Gln Leu Glu Met Lys
50 55 60

Thr Arg Gln Ala Tyr Ala Gly Ser Gln Met His Glu Pro Ala Phe Ala
65 70 75 80

Ala Phe Gln Glu Val Ala Met Ala His Asp Ile Ala Pro Ala Tyr Ala
85 90 95

Phe Asp His Leu Glu Gly Phe Ala Met Asp Val Arg Glu Thr Arg Tyr
100 105 110

Leu Thr Leu Asp Asp Thr Leu Arg Tyr Cys Tyr His Val Ala Gly Val
115 120 125

Val Gly Leu Met Met Ala Gln Ile Met Gly Val Arg Asp Asn Ala Thr
130 135 140

Leu Asp Arg Ala Cys Asp Leu Gly Leu Ala Phe Gln Leu Thr Asn Ile
145 150 155 160

Ala Arg Asp Ile Val Asp Asp Ala Gln Val Gly Arg Cys Tyr Leu Pro
165 170 175

Glu Ser Trp Leu Glu Glu Glu Gly Leu Thr Lys Ala Asn Tyr Ala Ala
180 185 190

Pro Glu Asn Arg Gln Ala Leu Ser Arg Ile Ala Gly Arg Leu Val Arg
195 200 205

Glu Ala Glu Pro Tyr Tyr Val Ser Ser Met Ala Gly Leu Ala Gln Leu
210 215 220

Pro Leu Arg Ser Ala Trp Ala Ile Ala Thr Ala Lys Gln Val Tyr Arg
225 230 235 240

Lys Ile Gly Val Lys Val Glu Gln Ala Gly Lys Gln Ala Trp Asp His
245 250 255

Arg Gln Ser Thr Ser Thr Ala Glu Lys Leu Thr Leu Leu Leu Thr Ala
260 265 270

Ser Gly Gln Ala Val Thr Ser Arg Met Lys Thr Tyr Pro Pro Arg Pro
275 280 285

Ala His Leu Trp Gln Arg Pro Ile
290 295



















<210> 11 
<211> 528 
<212> DNA 

<213> Pantoea stewartii 



<400> 11 

atgttgtgga tttggaatgc cctgatcgtg tttgtcaccg tggtcggcat ggaagtggtt 60
gctgcactgg cacataaata catcatgcac ggctggggtt ggggctggca tctttcacat 120
catgaaccgc gtaaaggcgc atttgaagtt aacgatctct atgccgtggt attcgccatt 180
gtgtcgattg ccctgattta cttcggcagt acaggaatct ggccgctcca gtggattggt 240
gcaggcatga ccgcttatgg tttactgtat tttatggtcc acgacggact ggtacaccag 300
cgctggccgt tccgctacat accgcgcaaa ggctacctga aacggttata catggcccac 360
cgtatgcatc atgctgtaag gggaaaagag ggctgcgtgt cctttggttt tctgtacgcg 420
ccaccgttat ctaaacttca ggcgacgctg agagaaaggc atgcggctag atcgggcgct 480
gccagagatg agcaggacgg ggtggatacg tcttcatccg ggaagtaa 528



<210> 12 

<211> 175 

<212> PRT 

<213> Pantoea stewartii 



<400> 12 

Met Leu Trp Ile Trp Asn Ala Leu Ile Val Phe Val Thr Val Val Gly
1 5 10 15

Met Glu Val Val Ala Ala Leu Ala His Lys Tyr Ile Met His Gly Trp
20 25 30

Gly Trp Gly Trp His Leu Ser His His Glu Pro Arg Lys Gly Ala Phe
35 40 45

Glu Val Asn Asp Leu Tyr Ala Val Val Phe Ala Ile Val Ser Ile Ala
50 55 60

Leu Ile Tyr Phe Gly Ser Thr Gly Ile Trp Pro Leu Gln Trp Ile Gly
65 70 75 80

Ala Gly Met Thr Ala Tyr Gly Leu Leu Tyr Phe Met Val His Asp Gly
85 90 95

Leu Val His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr
100 105 110

Leu Lys Arg Leu Tyr Met Ala His Arg Met His His Ala Val Arg Gly
115 120 125

Lys Glu Gly Cys Val Ser Phe Gly Phe Leu Tyr Ala Pro Pro Leu Ser
130 135 140

Lys Leu Gln Ala Thr Leu Arg Glu Arg His Ala Ala Arg Ser Gly Ala
145 150 155 160

Ala Arg Asp Glu Gln Asp Gly Val Asp Thr Ser Ser Ser Gly Lys
165 170 175



<210> 13 
<211> 29 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 13 

atyatgcacg gctggggwtg gsgmtggca 

<210> 14 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 14 

ggccarcgyt gatgcaccag mccgtcrtgc a 

<210> 15 
<211> 26
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 15 

ctgatgctct aygcctggtg ccgcca 

<210> 16 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 




<400> 16

tcgcgrgcra trttsgtcar ctg 23 

<210> 17 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 17 

atbmtsatgg aygcsacsgt 20 

<210> 18 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 18 

ytratcgarg ayacgcrcta 20 

<210> 19 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 19 

rsggcagyga atagccrgtg 20 

<210> 20 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 20 

aacagcatsc grttcagcak gcgsa 25 

<210> 21 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 21 






ccgacggtka tcaccgatcc 

<210> 22 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 22 

ctgcgccsac caggtagag 

<210> 23 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 23 

ctygacgaya tgccctgcat ggac 

<210> 24 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 24 

gtcgatttwc csgcgtcctk attg 

<210> 25 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 25 

ggccgaattc caacgatgct ctggcagtta 

<210> 26 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 26 

ggccagatct acttcaggcg acgctgagag 




<210> 27 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 27 

ggccagatct tacgcgcggg taaagccaat 

<210> 28 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 28 

ggcctctaga attaccgcgt ggttctgaag 

<210> 29 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 29 

ggcctctaga tctgtacgcg ccaccgttat 

<210> 30 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 30 

catcggtaag atcgtcaagc aactgaa 

<210> 31 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 31 

gatttacctg catcctgatt gatgtct 

<210> 32 
<211> 27 




<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 32 

atgtataacc gtttcaggta gcctttg 27

<210> 33
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 33

aatacagtaa accataagcg gtcatgc 27

<210> 34
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 34

ttcatcatcg cgcatgac 18

<210> 35
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 35

agrtgrtgyt cgtgrtga 18

<210> 36
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer
<400> 36

gcggcatagg ctagattgaa g

<210> 37
<211> 20
<212> DNA

<213> Artificial Sequence

<220> 

<223> Primer 



<400> 37 

gcgagttcct tctcacctat 

<210> 38 
<211> 735 
<212> DNA 

<213> Brevundimonas aurantiaca 



<400> 38

atgaccgccg ccgtcgccga gccacgcacc gtcccgcgcc agacctggat cggtctgacc 60
ctggcgggaa tgatcgtggc gggatgggcg gttctgcatg tctacggcgt ctattttcac 120
cgatgggggc cgttgaccct ggtgatcgcc ccggcgatcg tggcggtcca gacctggttg 180
tcggtcggcc ttttcatcgt cgcccatgac gccatgtacg gctccctggc gccgggacgg 240
ccgcggctga acgccgcagt cggccggctg accctggggc tctatgcggg cttccgcttc 300
gatcggctga agacggcgca ccacgcccac cacgccgcgc ccggcacggc cgacgacccg 360
gattttcacg ccccggcgcc ccgcgccttc cttccctggt tcctgaactt ctttcgcacc 420
tatttcggct ggcgcgagat ggcggtcctg accgccctgg tcctgatcgc cctcttcggc 480
ctgggggcgc ggccggccaa tctcctgacc ttctgggccg cgccggccct gctttcagcg 540
cttcagctct tcaccttcgg cacctggctg ccgcaccgcc acaccgacca gccgttcgcc 600
gacgcgcacc acgcccgcag cagcggctac ggccccgtgc tttccctgct cacctgtttc 660
cacttcggcc gccaccacga acaccatctg agcccctggc ggccctggtg gcgtctgtgg 720
cgcggcgagt cttga 735

<210> 39
<211> 244
<212> PRT

<213> Brevundimonas aurantiaca
<400> 39

Met Thr Ala Ala Val Ala Glu Pro Arg Thr Val Pro Arg Gln Thr Trp
1 5 10 15

Ile Gly Leu Thr Leu Ala Gly Met Ile Val Ala Gly Trp Ala Val Leu
20 25 30

His Val Tyr Gly Val Tyr Phe His Arg Trp Gly Pro Leu Thr Leu Val
35 40 45

Ile Ala Pro Ala Ile Val Ala Val Gln Thr Trp Leu Ser Val Gly Leu
50 55 60

Phe Ile Val Ala His Asp Ala Met Tyr Gly Ser Leu Ala Pro Gly Arg
65 70 75 80

Pro Arg Leu Asn Ala Ala Val Gly Arg Leu Thr Leu Gly Leu Tyr Ala
85 90 95

Gly Phe Arg Phe Asp Arg Leu Lys Thr Ala His His Ala His His Ala
100 105 110

Ala Pro Gly Thr Ala Asp Asp Pro Asp Phe His Ala Pro Ala Pro Arg
115 120 125

Ala Phe Leu Pro Trp Phe Leu Asn Phe Phe Arg Thr Tyr Phe Gly Trp
130 135 140

Arg Glu Met Ala Val Leu Thr Ala Leu Val Leu Ile Ala Leu Phe Gly
145 150 155 160

Leu Gly Ala Arg Pro Ala Asn Leu Leu Thr Phe Trp Ala Ala Pro Ala
165 170 175

Leu Leu Ser Ala Leu Gln Leu Phe Thr Phe Gly Thr Trp Leu Pro His
180 185 190

Arg His Thr Asp Gln Pro Phe Ala Asp Ala His His Ala Arg Ser Ser
195 200 205

Gly Tyr Gly Pro Val Leu Ser Leu Leu Thr Cys Phe His Phe Gly Arg
210 215 220

His His Glu His His Leu Ser Pro Trp Arg Pro Trp Trp Arg Leu Trp
225 230 235 240

Arg Gly Glu Ser



<210> 40 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 40 

ccaygaygay atwatgga 

<210> 41 
<211> 18 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 41 

yttyttvccy tycctaat 

<210> 42 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 42 

acagcgttgg acactcag 

<210> 43 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 43 

gcgtcgataa tggaagtgag 

<210> 44 
<211> 1496
<212> DNA 

<213> Sulfolobus shibatae 






<400> 44 

ttaccagtgt taaaaagtgc tatagaaggt aaggaaagtt tagaacaatt ctttagaaag 60
ataatatttg aattgaaggc cgccatgatg cttactggtt ctaaagacgt tgatgcgtta 120
aagaagacca gtattgttat tttaggtaaa cttaaagagt gggcagaata tagggggata 180
aatttatcta tatacgagaa agttagaaag agagaataaa atgagtgacg aattaagttc 240
gtattttaat gatatagtta acaatgtaaa ttttcatata aaaaattttg taaagagcaa 300
tgttagaacg cttgaggaag catcgtttca tttatttaca gctgggggca aaagacttag 360
acccttaatt ctggtttcat cgtcagactt aattggcggg gacaggcaaa gggcatataa 420
ggcagcagct gccgtggaga ttcttcacaa ctttactcta gttcatgacg atataatgga 480
tagggattac ctaagaagag gattaccaac tgttcatgta aagtggggtg aaccaatggc 540
aatacttgca ggtgattact tacacgccaa ggcttttgaa gccttaaatg aggctctaaa 600
aggtcttgac gggaatacgt tttataaggc tttttccgta tttattaatt ctattgagat 660
aatatcggaa ggtcaagcaa tggatatgtc atttgaaaat agagtagatg taactgagga 720
agagtacatg caaatgataa aaggaaagac tgcgatgcta ttttcatgtt ctgctgcatt 780
aggcggtata attaacaagg ctagcgatga tataattaaa aatttagtcg aatatggatt 840
aaatctaggc atatcattcc aaatagtgga tgatatctta ggaattattg gagaccaaaa 900
ggaattaggg aaaccagttt acagtgatat tagggaaggt aagaaaacaa ttcttgttat 960
aaaaacttta agtgaagcta ctgacgatga aaagaaaatt ctggtttcta cgcttgggaa 1020
tagggaggct aaaaaggacg atcttgagag agcgtcggaa ataataagga agtattcatt 1080
gcaatatgca tacaatttag ctaaaaagta ctcagatctt gcattagaac atttgcgtaa 1140
aattccagtt tacaatgaaa ctgctgaaaa ggctttaaaa tatctagcgc agtttaccat 1200
tgaaaggaga aagtaaatga gcatatcagg gatattgctt tcaattttta tatccttttt 1260
cataagctat attacaacag tctgggtaat aagacaggca aaaaagagtg ggcttgtagg 1320
taaggatgta aataaaccag ataaaccgga aataccacta atgggtggga taagtataat 1380
agccgggttt atagcgggat ccttctcctt attactaact gatgtaagaa gtgagcgagt 1440
aattccatct gtaatactct cctcattgct tatagcattt cttggactat tagatg 1496



<210> 45 
<211> 331 
<212> PRT 

<213> Sulfolobus shibatae 



<400> 45 



Met Ser Asp Glu Leu Ser Ser Tyr Phe Asn Asp Ile Val Asn Asn Val
1 5 10 15

Asn Phe His Ile Lys Asn Phe Val Lys Ser Asn Val Arg Thr Leu Glu
20 25 30

Glu Ala Ser Phe His Leu Phe Thr Ala Gly Gly Lys Arg Leu Arg Pro
35 40 45

Leu Ile Leu Val Ser Ser Ser Asp Leu Ile Gly Gly Asp Arg Gln Arg
50 55 60

Ala Tyr Lys Ala Ala Ala Ala Val Glu Ile Leu His Asn Phe Thr Leu
65 70 75 80

Val His Asp Asp Ile Met Asp Arg Asp Tyr Leu Arg Arg Gly Leu Pro
85 90 95

Thr Val His Val Lys Trp Gly Glu Pro Met Ala Ile Leu Ala Gly Asp
100 105 110

Tyr Leu His Ala Lys Ala Phe Glu Ala Leu Asn Glu Ala Leu Lys Gly
115 120 125

Leu Asp Gly Asn Thr Phe Tyr Lys Ala Phe Ser Val Phe Ile Asn Ser
130 135 140

Ile Glu Ile Ile Ser Glu Gly Gln Ala Met Asp Met Ser Phe Glu Asn
145 150 155 160

Arg Val Asp Val Thr Glu Glu Glu Tyr Met Gln Met Ile Lys Gly Lys
165 170 175

Thr Ala Met Leu Phe Ser Cys Ser Ala Ala Leu Gly Gly Ile Ile Asn
180 185 190

Lys Ala Ser Asp Asp Ile Ile Lys Asn Leu Val Glu Tyr Gly Leu Asn
195 200 205

Leu Gly Ile Ser Phe Gln Ile Val Asp Asp Ile Leu Gly Ile Ile Gly
210 215 220

Asp Gln Lys Glu Leu Gly Lys Pro Val Tyr Ser Asp Ile Arg Glu Gly
225 230 235 240

Lys Lys Thr Ile Leu Val Ile Lys Thr Leu Ser Glu Ala Thr Asp Asp
245 250 255

Glu Lys Lys Ile Leu Val Ser Thr Leu Gly Asn Arg Glu Ala Lys Lys
260 265 270

Asp Asp Leu Glu Arg Ala Ser Glu Ile Ile Arg Lys Tyr Ser Leu Gln
275 280 285

Tyr Ala Tyr Asn Leu Ala Lys Lys Tyr Ser Asp Leu Ala Leu Glu His
290 295 300

Leu Arg Lys Ile Pro Val Tyr Asn Glu Thr Ala Glu Lys Ala Leu Lys
305 310 315 320

Tyr Leu Ala Gln Phe Thr Ile Glu Arg Arg Lys
325 330



<210> 46 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Exemplary motif 
<400> 46 

aggtcgtgta ctgtcagtca 20

<210> 47
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Exemplary motif
<400> 47

acgtggtgaa ctgccagtga 20
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